This function is expected to be called from path-based syscalls to help
them decide whether to try the lookup and call again in the event that
they got an -ESTALE return back on an earier try.
Currently, we only retry the call once on an ESTALE error, but in the
event that we decide that that's not enough in the future, we should be
able to change the logic in this helper without too much effort.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Commit 8e22cc88d6 removes the (un)lock_super
function definitions but forgets to remove their prototypes.
Signed-off-by: Alessio Igor Bogani <abogani@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Mark as cancelled an operation that is in progress rather than pending at the
time it is cancelled, and call fscache_complete_op() to cancel an operation so
that blocked ops can be started.
Signed-off-by: David Howells <dhowells@redhat.com>
Convert the fscache_object event IDs from #defines into an enum. Also add an
extra label to the enum to carry the event count and redefine the event mask
in terms of that.
Signed-off-by: David Howells <dhowells@redhat.com>
Make a more complete truncate operation available to CacheFiles (including
security checks and suchlike) so that it can use this to clear invalidated
cache files.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Pull nfsd update from Bruce Fields:
"Included this time:
- more nfsd containerization work from Stanislav Kinsbursky: we're
not quite there yet, but should be by 3.9.
- NFSv4.1 progress: implementation of basic backchannel security
negotiation and the mandatory BACKCHANNEL_CTL operation. See
http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues
for remaining TODO's
- Fixes for some bugs that could be triggered by unusual compounds.
Our xdr code wasn't designed with v4 compounds in mind, and it
shows. A more thorough rewrite is still a todo.
- If you've ever seen "RPC: multiple fragments per record not
supported" logged while using some sort of odd userland NFS client,
that should now be fixed.
- Further work from Jeff Layton on our mechanism for storing
information about NFSv4 clients across reboots.
- Further work from Bryan Schumaker on his fault-injection mechanism
(which allows us to discard selective NFSv4 state, to excercise
rarely-taken recovery code paths in the client.)
- The usual mix of miscellaneous bugs and cleanup.
Thanks to everyone who tested or contributed this cycle."
* 'for-3.8' of git://linux-nfs.org/~bfields/linux: (111 commits)
nfsd4: don't leave freed stateid hashed
nfsd4: free_stateid can use the current stateid
nfsd4: cleanup: replace rq_resused count by rq_next_page pointer
nfsd: warn on odd reply state in nfsd_vfs_read
nfsd4: fix oops on unusual readlike compound
nfsd4: disable zero-copy on non-final read ops
svcrpc: fix some printks
NFSD: Correct the size calculation in fault_inject_write
NFSD: Pass correct buffer size to rpc_ntop
nfsd: pass proper net to nfsd_destroy() from NFSd kthreads
nfsd: simplify service shutdown
nfsd: replace boolean nfsd_up flag by users counter
nfsd: simplify NFSv4 state init and shutdown
nfsd: introduce helpers for generic resources init and shutdown
nfsd: make NFSd service structure allocated per net
nfsd: make NFSd service boot time per-net
nfsd: per-net NFSd up flag introduced
nfsd: move per-net startup code to separated function
nfsd: pass net to __write_ports() and down
nfsd: pass net to nfsd_set_nrthreads()
...
Provide a proper invalidation method rather than relying on the netfs retiring
the cookie it has and getting a new one. The problem with this is that isn't
easy for the netfs to make sure that it has completed/cancelled all its
outstanding storage and retrieval operations on the cookie it is retiring.
Instead, have the cache provide an invalidation method that will cancel or wait
for all currently outstanding operations before invalidating the cache, and
will cause new operations to queue up behind that. Whilst invalidation is in
progress, some requests will be rejected until the cache can stack a barrier on
the operation queue to cause new operations to be deferred behind it.
Signed-off-by: David Howells <dhowells@redhat.com>
Pull Ceph update from Sage Weil:
"There are a few different groups of commits here. The largest is
Alex's ongoing work to enable the coming RBD features (cloning,
striping). There is some cleanup in libceph that goes along with it.
Cyril and David have fixed some problems with NFS reexport (leaking
dentries and page locks), and there is a batch of patches from Yan
fixing problems with the fs client when running against a clustered
MDS. There are a few bug fixes mixed in for good measure, many of
which will be going to the stable trees once they're upstream.
My apologies for the late pull. There is still a gremlin in the rbd
map/unmap code and I was hoping to include the fix for that as well,
but we haven't been able to confirm the fix is correct yet; I'll send
that in a separate pull once it's nailed down."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (68 commits)
rbd: get rid of rbd_{get,put}_dev()
libceph: register request before unregister linger
libceph: don't use rb_init_node() in ceph_osdc_alloc_request()
libceph: init event->node in ceph_osdc_create_event()
libceph: init osd->o_node in create_osd()
libceph: report connection fault with warning
libceph: socket can close in any connection state
rbd: don't use ENOTSUPP
rbd: remove linger unconditionally
rbd: get rid of RBD_MAX_SEG_NAME_LEN
libceph: avoid using freed osd in __kick_osd_requests()
ceph: don't reference req after put
rbd: do not allow remove of mounted-on image
libceph: Unlock unprocessed pages in start_read() error path
ceph: call handle_cap_grant() for cap import message
ceph: Fix __ceph_do_pending_vmtruncate
ceph: Don't add dirty inode to dirty list if caps is in migration
ceph: Fix infinite loop in __wake_requests
ceph: Don't update i_max_size when handling non-auth cap
bdi_register: add __printf verification, fix arg mismatch
...
Fix the state management of internal fscache operations and the accounting of
what operations are in what states.
This is done by:
(1) Give struct fscache_operation a enum variable that directly represents the
state it's currently in, rather than spreading this knowledge over a bunch
of flags, who's processing the operation at the moment and whether it is
queued or not.
This makes it easier to write assertions to check the state at various
points and to prevent invalid state transitions.
(2) Add an 'operation complete' state and supply a function to indicate the
completion of an operation (fscache_op_complete()) and make things call
it. The final call to fscache_put_operation() can then check that an op
in the appropriate state (complete or cancelled).
(3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better
govern the state of an object:
(a) The ->n_ops is now the number of extant operations on the object
and is now decremented by fscache_put_operation() only.
(b) The ->n_in_progress is simply the number of objects that have been
taken off of the object's pending queue for the purposes of being
run. This is decremented by fscache_op_complete() only.
(c) The ->n_exclusive is the number of exclusive ops that have been
submitted and queued or are in progress. It is decremented by
fscache_op_complete() and by fscache_cancel_op().
fscache_put_operation() and fscache_operation_gc() now no longer try to
clean up ->n_exclusive and ->n_in_progress. That was leading to double
decrements against fscache_cancel_op().
fscache_cancel_op() now no longer decrements ->n_ops. That was leading to
double decrements against fscache_put_operation().
fscache_submit_exclusive_op() now decides whether it has to queue an op
based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter
will persist in being true even after all preceding operations have been
cancelled or completed. Furthermore, if an object is active and there are
runnable ops against it, there must be at least one op running.
(4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and
provide a function to record completion of the pages as they complete.
When n_pages reaches 0, the operation is deemed to be complete and
fscache_op_complete() is called.
Add calls to fscache_retrieval_complete() anywhere we've finished with a
page we've been given to read or allocate for. This includes places where
we just return pages to the netfs for reading from the server and where
accessing the cache fails and we discard the proposed netfs page.
The bugs in the unfixed state management manifest themselves as oopses like the
following where the operation completion gets out of sync with return of the
cookie by the netfs. This is possible because the cache unlocks and returns
all the netfs pages before recording its completion - which means that there's
nothing to stop the netfs discarding them and returning the cookie.
FS-Cache: Cookie 'NFS.fh' still has outstanding reads
------------[ cut here ]------------
kernel BUG at fs/fscache/cookie.c:519!
invalid opcode: 0000 [#1] SMP
CPU 1
Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc
Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY
RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache]
RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282
RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000
RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c
RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000
R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98
R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370
FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040)
Stack:
ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0
ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0
ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91
Call Trace:
[<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs]
[<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs]
[<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs]
[<ffffffff810d8d47>] evict+0xa1/0x15c
[<ffffffff810d8e2e>] dispose_list+0x2c/0x38
[<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b
[<ffffffff810c56b7>] prune_super+0xd5/0x140
[<ffffffff8109b615>] shrink_slab+0x102/0x1ab
[<ffffffff8109d690>] balance_pgdat+0x2f2/0x595
[<ffffffff8103e009>] ? process_timeout+0xb/0xb
[<ffffffff8109dba3>] kswapd+0x270/0x289
[<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46
[<ffffffff8109d933>] ? balance_pgdat+0x595/0x595
[<ffffffff8104bf7a>] kthread+0x7f/0x87
[<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10
[<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0
[<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe
[<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53
[<ffffffff813ad6b0>] ? gs_change+0xb/0xb
Signed-off-by: David Howells <dhowells@redhat.com>
Make fscache_relinquish_cookie() log a warning and wait if there are any
outstanding reads left on the cookie it was given.
Signed-off-by: David Howells <dhowells@redhat.com>
Highlights:
- Add initial f2fs source codes
- Fix an endian conversion bug
- Fix build failures on random configs
- Fix the power-off-recovery routine
- Minor cleanup, coding style, and typos patches
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQxuJcAAoJEEAUqH6CSFDSq80QAI3i7NgUkx4h225MnbJdEKRb
YX1MfSPmgE0q/15XS2qQu/s9NGJmXLV1IR9EtRSBlCQjwWhbx9Q9URktGkWslFnx
6mBLy8EvVKDMVdwoUS8ZY6IjfKbmSnoIHTZrGaT9+9d7k8nlOQLaj3qQF4wBuw1+
+qhJQV642v8qw7JiVVFgxcBSLpAS9cbdOA0vxfWncMwmRLaEO45W5+rob8ZN8WaS
BUiYIiue8vlB0VDIYfpl/sSPJC/Bn1XsLKZoS2WJl8CKioE1ptLjT3acUBbabUxp
hNLl8Ae0PylDMFpH8hrBXhleznrVqEMOTos/Z80/UyBny2sCxJFnaQ60TayUo2l2
hYk5Wbyj8K7IBJEke23Fepild2PnGz22zf2v+tLxxVgPH5j7/l2XHfy9gPvDbd1P
4ENiJUC3LE49Mi4TvEIFqhbrcJfD9C+v3bxpWGe8CevrpYZaB8tv/6nQXJCC/Ixp
tMWqLKlHyXGmk5DZpiSFaj0/GbTPT0UGqZVRzzSXQpKqxJU6eTnXDa6aLUEYH8fH
grOCriaJrd8SgL3l7RokQSQEyRHuNjMm1tlUQWOObE+y0nJjWb9Amwn1yUtJuNzx
Np4nnlMhxwJ48P3LeeheSCuOUbxJtOzOR8MVXm7deYiGQbYaqB1/+9TbjOZBSX4O
fpbCXrmqe1pUBukftZsL
=iMoX
-----END PGP SIGNATURE-----
Merge tag 'for-3.8-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull new F2FS filesystem from Jaegeuk Kim:
"Introduce a new file system, Flash-Friendly File System (F2FS), to
Linux 3.8.
Highlights:
- Add initial f2fs source codes
- Fix an endian conversion bug
- Fix build failures on random configs
- Fix the power-off-recovery routine
- Minor cleanup, coding style, and typos patches"
From the Kconfig help text:
F2FS is based on Log-structured File System (LFS), which supports
versatile "flash-friendly" features. The design has been focused on
addressing the fundamental issues in LFS, which are snowball effect
of wandering tree and high cleaning overhead.
Since flash-based storages show different characteristics according to
the internal geometry or flash memory management schemes aka FTL, F2FS
and tools support various parameters not only for configuring on-disk
layout, but also for selecting allocation and cleaning algorithms.
and there's an article by Neil Brown about it on lwn.net:
http://lwn.net/Articles/518988/
* tag 'for-3.8-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (36 commits)
f2fs: fix tracking parent inode number
f2fs: cleanup the f2fs_bio_alloc routine
f2fs: introduce accessor to retrieve number of dentry slots
f2fs: remove redundant call to f2fs_put_page in delete entry
f2fs: make use of GFP_F2FS_ZERO for setting gfp_mask
f2fs: rewrite f2fs_bio_alloc to make it simpler
f2fs: fix a typo in f2fs documentation
f2fs: remove unused variable
f2fs: move error condition for mkdir at proper place
f2fs: remove unneeded initialization
f2fs: check read only condition before beginning write out
f2fs: remove unneeded memset from init_once
f2fs: show error in case of invalid mount arguments
f2fs: fix the compiler warning for uninitialized use of variable
f2fs: resolve build failures
f2fs: adjust kernel coding style
f2fs: fix endian conversion bugs reported by sparse
f2fs: remove unneeded version.h header file from f2fs.h
f2fs: update the f2fs document
f2fs: update Kconfig and Makefile
...
Under some circumstances CacheFiles defers the marking of pages with PG_fscache
so that it can take advantage of pagevecs to reduce the number of calls to
fscache_mark_pages_cached() and the netfs's hook to keep track of this.
There are, however, two problems with this:
(1) It can lead to the PG_fscache mark being applied _after_ the page is set
PG_uptodate and unlocked (by the call to fscache_end_io()).
(2) CacheFiles's ref on the page is dropped immediately following
fscache_end_io() - and so may not still be held when the mark is applied.
This can lead to the page being passed back to the allocator before the
mark is applied.
Fix this by, where appropriate, marking the page before calling
fscache_end_io() and releasing the page. This means that we can't take
advantage of pagevecs and have to make a separate call for each page to the
marking routines.
The symptoms of this are Bad Page state errors cropping up under memory
pressure, for example:
BUG: Bad page state in process tar pfn:002da
page:ffffea0000009fb0 count:0 mapcount:0 mapping: (null) index:0x1447
page flags: 0x1000(private_2)
Pid: 4574, comm: tar Tainted: G W 3.1.0-rc4-fsdevel+ #1064
Call Trace:
[<ffffffff8109583c>] ? dump_page+0xb9/0xbe
[<ffffffff81095916>] bad_page+0xd5/0xea
[<ffffffff81095d82>] get_page_from_freelist+0x35b/0x46a
[<ffffffff810961f3>] __alloc_pages_nodemask+0x362/0x662
[<ffffffff810989da>] __do_page_cache_readahead+0x13a/0x267
[<ffffffff81098942>] ? __do_page_cache_readahead+0xa2/0x267
[<ffffffff81098d7b>] ra_submit+0x1c/0x20
[<ffffffff8109900a>] ondemand_readahead+0x28b/0x29a
[<ffffffff81098ee2>] ? ondemand_readahead+0x163/0x29a
[<ffffffff810990ce>] page_cache_sync_readahead+0x38/0x3a
[<ffffffff81091d8a>] generic_file_aio_read+0x2ab/0x67e
[<ffffffffa008cfbe>] nfs_file_read+0xa4/0xc9 [nfs]
[<ffffffff810c22c4>] do_sync_read+0xba/0xfa
[<ffffffff81177a47>] ? security_file_permission+0x7b/0x84
[<ffffffff810c25dd>] ? rw_verify_area+0xab/0xc8
[<ffffffff810c29a4>] vfs_read+0xaa/0x13a
[<ffffffff810c2a79>] sys_read+0x45/0x6c
[<ffffffff813ac37b>] system_call_fastpath+0x16/0x1b
As can be seen, PG_private_2 (== PG_fscache) is set in the page flags.
Instrumenting fscache_mark_pages_cached() to verify whether page->mapping was
set appropriately showed that sometimes it wasn't. This led to the discovery
that sometimes the page has apparently been reclaimed by the time the marker
got to see it.
Reported-by: M. Stevens <m@tippett.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
The code that relied on that flag was ripped out of btrfs quite some
time ago, and never added back. Josef indicated that he was going to
take a different approach to the problem in btrfs, and that we
could just eliminate this flag.
Cc: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
A few new features this merge-window. The most important one is
probably, that dma-debug now warns if a dma-handle is not checked with
dma_mapping_error by the device driver. This requires minor changes to
some architectures which make use of dma-debug. Most of these changes
have the respective Acks by the Arch-Maintainers.
Besides that there are updates to the AMD IOMMU driver for refactor the
IOMMU-Groups support and to make sure it does not trigger a hardware
erratum.
The OMAP changes (for which I pulled in a branch from Tony Lindgren's
tree) have a conflict in linux-next with the arm-soc tree. The conflict
is in the file arch/arm/mach-omap2/clock44xx_data.c which is deleted in
the arm-soc tree. It is safe to delete the file too so solve the
conflict. Similar changes are done in the arm-soc tree in the common
clock framework migration. A missing hunk from the patch in the IOMMU
tree will be submitted as a seperate patch when the merge-window is
closed.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQzbQQAAoJECvwRC2XARrjXCIP/2RxBzbVOiaPOorl+ZWbsZ41
lzWiXsCHJkh4BK4/qGsVeKhiNd9LcbQUlhywnBbhWxym3spzmjGtvU2Hcg8QiO/M
R83r9S4e8Z6DnF9Gcats1Ns9BufgpyhLXg3XoXPxtyHOgRS59fvYi6xXOxyX30Dy
uhbj+WL6UD0zvOMNztEnM1p6UhX+XlpvzKDTR5+G5xKdVPkcgeiaKSwqz739caTn
QE2NpqIh+8Mwuu1nIapk8h07xhUYU5eGMXa38u1LvDwSHsrsCMLC+lXIjtInn7Gw
Bv+XcCHgtOaoPQwwk/xd2HVwJQxO9HNb5YX51EIjwP0C5S/3yW9Ji1RgqFb6Ewqq
jIkF6ckwUheLWsBGkw5UknI/f7RX3MDiTWkziYLIniYKKewm+ymGfgIqPt2TzLIO
tMZZiIssKvy7wOXQ5JjpYJg5Xmrau6opNwdEguC8pWkJT7qsn+3SeLjMt0Lh9IoY
+37DOgOLb3O3/vnZJ3i0KMRZBfVeaRj5HaGmlxFCYUZCNQymIPTih9Jtqm+WuVcu
YaGQCTtynsQ0JVh8YEekLzSfgd3OODP68fyCg1CQNixEgvUi2hd/toX2/Z1wkkSA
JC9bZarcoPkSWqaTAA2HvmaaxvRR+0UbhFPopFTQarVV0MVLZWBxoyuKy/nMrmMd
UgTzrDYy74UKdrSTwIXg
=pPHZ
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"A few new features this merge-window. The most important one is
probably, that dma-debug now warns if a dma-handle is not checked with
dma_mapping_error by the device driver. This requires minor changes
to some architectures which make use of dma-debug. Most of these
changes have the respective Acks by the Arch-Maintainers.
Besides that there are updates to the AMD IOMMU driver for refactor
the IOMMU-Groups support and to make sure it does not trigger a
hardware erratum.
The OMAP changes (for which I pulled in a branch from Tony Lindgren's
tree) have a conflict in linux-next with the arm-soc tree. The
conflict is in the file arch/arm/mach-omap2/clock44xx_data.c which is
deleted in the arm-soc tree. It is safe to delete the file too so
solve the conflict. Similar changes are done in the arm-soc tree in
the common clock framework migration. A missing hunk from the patch
in the IOMMU tree will be submitted as a seperate patch when the
merge-window is closed."
* tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (29 commits)
ARM: dma-mapping: support debug_dma_mapping_error
ARM: OMAP4: hwmod data: ipu and dsp to use parent clocks instead of leaf clocks
iommu/omap: Adapt to runtime pm
iommu/omap: Migrate to hwmod framework
iommu/omap: Keep mmu enabled when requested
iommu/omap: Remove redundant clock handling on ISR
iommu/amd: Remove obsolete comment
iommu/amd: Don't use 512GB pages
iommu/tegra: smmu: Move bus_set_iommu after probe for multi arch
iommu/tegra: gart: Move bus_set_iommu after probe for multi arch
iommu/tegra: smmu: Remove unnecessary PTC/TLB flush all
tile: dma_debug: add debug_dma_mapping_error support
sh: dma_debug: add debug_dma_mapping_error support
powerpc: dma_debug: add debug_dma_mapping_error support
mips: dma_debug: add debug_dma_mapping_error support
microblaze: dma-mapping: support debug_dma_mapping_error
ia64: dma_debug: add debug_dma_mapping_error support
c6x: dma_debug: add debug_dma_mapping_error support
ARM64: dma_debug: add debug_dma_mapping_error support
intel-iommu: Prevent devices with RMRRs from being placed into SI Domain
...
Latinoware 2012.
There's a slightly non-trivial merge in virtio-net, as we cleaned up the
virtio add_buf interface while DaveM accepted the mq virtio-net patches.
You can see my solution in my pending-rebases branch, if that helps, but I
know you love merging:
https://git.kernel.org/?p=linux/kernel/git/rusty/linux.git;a=commit;h=12e4e64fa66a4c812e4855de32abdb4d819526fe
Cheers,
Rusty.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQz/vKAAoJENkgDmzRrbjx+eYQAK/egj9T8Nnth6mkzdbCFSO7
Bciga2hDiudGCiGojTRGPRSc0VP9LgfvPbY2pxX+R9CfEqR+a8q/rRQhCS79ZwPB
/mJy3HNiCx418HZxgwNtk6vPe0PjJm6SsjbXeB9hB+PQLCbdwA0BjpG6xjF/jitP
noPqhhXreeQgYVxAKoFPvff/Byu2GlNnDdVMQxWRmo8hTKlTCzl0T/7BHRxthhJj
iOrXTFzrT/osPT0zyqlngT03T4wlBvL2Bfw8d/kuRPEZ71dpIctWeH2KzdwXVCrz
hFQGxAz4OWvW3xrNwj7c6O3SWj4VemUMjQqeA/PtRiOEI5gM0Y/Bit47dWL4wM/O
OWUKFHzq4DFs8MmwXBgDDXl5xOjOBH9Ik4FZayn3Y7COT/B8CjFdOC2MdDGmZ9yd
NInumg7FqP+u12g+9Vq8S/b0cfoQm4qFe8VHiPJu+jRmCZglyvLjk7oq/QwW8Gaq
Pkzit1Ey0DWo2KvZ4D/nuXJCuhmzN/AJ10M48lLYZhtOIVg9gsa0xjhfgq4FnvSK
xFCf3rcWnlGIXcOYh/hKU25WaCLzBuqMuSK35A72IujrQOL7OJTk4Oqote3Z3H9B
08XJmyW6SOZdfw17X4Im1jbyuLek///xQJ9Jw/tya7j9lBt8zjJ+FmLPs4mLGEOm
WJv9uZPs+QbIMNky2Lcb
=myDR
-----END PGP SIGNATURE-----
Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull virtio update from Rusty Russell:
"Some nice cleanups, and even a patch my wife did as a "live" demo for
Latinoware 2012.
There's a slightly non-trivial merge in virtio-net, as we cleaned up
the virtio add_buf interface while DaveM accepted the mq virtio-net
patches."
* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (27 commits)
virtio_console: Add support for remoteproc serial
virtio_console: Merge struct buffer_token into struct port_buffer
virtio: add drv_to_virtio to make code clearly
virtio: use dev_to_virtio wrapper in virtio
virtio-mmio: Fix irq parsing in command line parameter
virtio_console: Free buffers from out-queue upon close
virtio: Convert dev_printk(KERN_<LEVEL> to dev_<level>(
virtio_console: Use kmalloc instead of kzalloc
virtio_console: Free buffer if splice fails
virtio: tools: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: scsi: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: rpmsg: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: net: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: console: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: make virtqueue_add_buf() returning 0 on success, not capacity.
virtio: console: don't rely on virtqueue_add_buf() returning capacity.
virtio_net: don't rely on virtqueue_add_buf() returning capacity.
virtio-net: remove unused skb_vnet_hdr->num_sg field
virtio-net: correct capacity math on ring full
virtio: move queue_index and num_free fields into core struct virtqueue.
...
This is a batch of fixes for arm-soc platforms, most of it is for OMAP
but there are others too (i.MX, Tegra, ep93xx). Fixes warnings, some
broken platforms and drivers, etc. A bit all over the map really.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQ0BBZAAoJEIwa5zzehBx3WwgP/jS31XauTUrGLEOCUzarINB/
7ZVGkkVv9cp4AqW1lcBAyQak424ff2hxfhJWxRthBJT/fQ2OFcdZFWLkFEG2kO0y
PZ5WCxI1Q4ZNz8iW9qynIRCyhvzhaTHwA7wsGqmGRl1u2VMfoeBiAPNoxTAnpUEm
05L7EBDVSK++KgvkuoQ2KeWOII9IKNaHH4Yg5y8/guCsTbsWjzo4cjS5MGmo3s7r
6ArPr2h2WvSbayL67aPheBwg6K0ScY/JJQhL/8HgFbnnnL+mDcZ9iH5yKl/b25BX
FnQkjb37p0GUDNhXQOoElifghDF8rIAD6o5WDgTL2h5uun4WImYbMS/CktnLAQeH
wNVvrpmpxi11xf9D3SCRCM4jVAy4u1DVEL/0FElWCx1hn4iixm9hGvQ+YPq6/pMl
LOP87mzmFn+FhPtj7HIDp5B1ECw1xqcP064FHUYhMEntEHfN/Xh6gmDooesDrbUf
VuvjrSRBIeTI98xrNALepqWn5w/veZtIymZdEz9vlcK8gQ+tHNt7oI/RbDb08cap
gyMboWciAm2F6FE9QNlrjQHTbCEIrE5BoryRW87ArTbYhKvnZ1vpW01YmJHhA6LE
v6glfw6FWYAU8wlF4xbyzN1Gy3PB4kUMCGDqZzh4wU1JGXklJm3CbIf73sOwRmg7
lmnw+RYq6z/RMmPjjrb4
=4g/i
-----END PGP SIGNATURE-----
Merge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"This is a batch of fixes for arm-soc platforms, most of it is for OMAP
but there are others too (i.MX, Tegra, ep93xx). Fixes warnings, some
broken platforms and drivers, etc. A bit all over the map really."
There was some concern about commit 68136b10 ("RM: sunxi: Change device
tree naming scheme for sunxi"), but Tony says:
"Looks like that's trivial to fix as needed, no need to rebuild the
branch to fix that AFAIK.
The fix can be done once Olof is available online again.
Linus, I suggest that you go ahead and pull this if there are no other
issues with this branch."
* tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (32 commits)
ARM: sunxi: Change device tree naming scheme for sunxi
ARM: ux500: fix missing include
ARM: u300: delete custom pin hog code
ARM: davinci: fix build break due to missing include
ARM: exynos: Fix warning due to missing 'inline' in stub
ARM: imx: Move platform-mx2-emma to arch/arm/mach-imx/devices
ARM i.MX51 clock: Fix regression since enabling MIPI/HSP clocks
ARM: dts: mx27: Fix the AIPI bus for FEC
ARM: OMAP2+: common: remove use of vram
ARM: OMAP3/4: cpuidle: fix sparse and checkpatch warnings
ARM: OMAP4: clock data: DPLLs are missing bypass clocks in their parent lists
ARM: OMAP4: clock data: div_iva_hs_clk is a power-of-two divider
ARM: OMAP4: Fix EMU clock domain always on
ARM: OMAP4460: Workaround ABE DPLL failing to turn-on
ARM: OMAP4: Enhance support for DPLLs with 4X multiplier
ARM: OMAP4: Add function table for non-M4X dplls
ARM: OMAP4: Update timer clock aliases
ARM: OMAP: Move plat/omap-serial.h to include/linux/platform_data/serial-omap.h
ARM: dts: Add build target for omap4-panda-a4
ARM: dts: OMAP2420: Correct H4 board memory size
...
Documentation says that code requiring dma-buf should add it to
select, so inline fallbacks are not going to be used. A link error
will make it obvious what went wrong, instead of silently doing
nothing at runtime.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Rob Clark <rob.clark@linaro.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Pull networking fixes from David Miller:
1) Really fix tuntap SKB use after free bug, from Eric Dumazet.
2) Adjust SKB data pointer to point past the transport header before
calling icmpv6_notify() so that the headers are in the state which
that function expects. From Duan Jiong.
3) Fix ambiguities in the new tuntap multi-queue APIs. From Jason
Wang.
4) mISDN needs to use del_timer_sync(), from Konstantin Khlebnikov.
5) Don't destroy mutex after freeing up device private in mac802154,
fix also from Konstantin Khlebnikov.
6) Fix INET request socket leak in TCP and DCCP, from Christoph Paasch.
7) SCTP HMAC kconfig rework, from Neil Horman.
8) Fix SCTP jprobes function signature, otherwise things explode, from
Daniel Borkmann.
9) Fix typo in ipv6-offload Makefile variable reference, from Simon
Arlott.
10) Don't fail USBNET open just because remote wakeup isn't supported,
from Oliver Neukum.
11) be2net driver bug fixes from Sathya Perla.
12) SOLOS PCI ATM driver bug fixes from Nathan Williams and David
Woodhouse.
13) Fix MTU changing regression in 8139cp driver, from John Greene.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits)
solos-pci: ensure all TX packets are aligned to 4 bytes
solos-pci: add firmware upgrade support for new models
solos-pci: remove superfluous debug output
solos-pci: add GPIO support for newer versions on Geos board
8139cp: Prevent dev_close/cp_interrupt race on MTU change
net: qmi_wwan: add ZTE MF880
drivers/net: Use of_match_ptr() macro in smsc911x.c
drivers/net: Use of_match_ptr() macro in smc91x.c
ipv6: addrconf.c: remove unnecessary "if"
bridge: Correctly encode addresses when dumping mdb entries
bridge: Do not unregister all PF_BRIDGE rtnl operations
use generic usbnet_manage_power()
usbnet: generic manage_power()
usbnet: handle PM failure gracefully
ksz884x: fix receive polling race condition
qlcnic: update driver version
qlcnic: fix unused variable warnings
net: fec: forbid FEC_PTP on SoCs that do not support
be2net: fix wrong frag_idx reported by RX CQ
be2net: fix be_close() to ensure all events are ack'ed
...
error and a missing symbol export.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQ0l2KAAoJEEFnBt12D9kB+tUQAKMjdtBO4MV9LaSham/yj+bf
f7aGoslEFHloXOvP0/Hg8+bP+Z0El5p7ncCZIROtN2pfhYad1CjbZHhwhJeeRMuJ
vQT+uy9BZ09pWoSvkLCWdUQyEGYWMxaOQtDMVHEdURU7nRBOPlntrCijoS68pcK8
Tw9XsX69Qurk/Z8q8DXf8hmNF49Cyv8ax1rywqjTTT0yzR+UzQGldXUDqEg9bg/t
Qcf03xwUSojdEBQrLo8aaFm32EUguqB02WqS1KMQBaz6FnbqmvVdVyIgVea7kgNi
mws1um6/3B/yDYB3ESKNXeZiF1bW03ccdITOeRe+mBDEz0JjnZQmQb30xaT6kjT/
z4VV1DVFAKFQ4M0HlKmv7xvcBcNdzuhrhiFteGNqDkU+zGwgl4GkiSP/c6l0CRVt
Ij8jHM+YtLmeI1ajx2V9OhM4xzK1Upo6+zi5zgGxLqflvBnFysBVuLjV8/KdeJ6Z
FW5J0iOMbIdRKdBZugj+c47qMuBjXx+BZwUxoaAbgHnFLctkF3cWhc3sLpqNi6pe
6F8GkfEIW8nc6RYLb+Rh4e29mZgAMub+GS3bqKA0bcpOjgm6zwNV27ws7lLxwz2u
d5Xf6uB2nfZGSZJtRa8mvqwEFkMoFPwaAD3XX2DmpSPZ0jTX5X1bRMa4u3UA7uJF
VIdMKi+ZzuVhLpaMtnXc
=srgY
-----END PGP SIGNATURE-----
mergetag object bc1008cf7d
type commit
tag gpio-for-linus
tagger Grant Likely <grant.likely@secretlab.ca> 1355962627 +0000
GPIO device driver bug fixes:
- gpio/mvebu-gpio: Make mvebu-gpio depend on OF_CONFIG
- gpio/ich: Add missing spinlock init
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQ0lmWAAoJEEFnBt12D9kBhxEP/0RQ87g52xXFxYpfIRRw6YIY
YlXR0noPEJqYnOUGHI0pi+P6mFDvv95etT3khEsuwVkSbb0c0fEl8m6w8O05aTd9
DHdQ4ilvUE77+xO2uZsRN6VxBgnApaj8qMdLWi09yFnSm43wegWTi38IpnY2W4PE
/qgTWT+pBraW+l4TQ/dA4hvhsTorH1McnbaNWQG1oNHbrnQ4Q43UII8qR02exz6Y
CTgo5qpgvzbz334VQ/VYPgd2NOE1wZ6BihMZr8wNVv25Ip1tWhEh4BbCjpMmGBYC
FnHfmKB3zeDdHbsnV2TZc2vC48qnMblGSkMjKsKZn0IOfH2YSym94L0VaXZhCLkw
hjtZxrDnbkHzpkLc7inc8IxbTG1wyhtDQhIf9HBzO33QPtdXw2v6GVtLJ0Ca9ikB
1T3XJZPZW+JwaEUdsw4UP7ZkJ7cwRG+fxB3iN57QDKj6Hjy+i/hA3GOjbF1VAOsb
xTrLcfnYvQ81BXdMP2rzPo2c/XdwroW6SNNxYq8xkzlVZuRT0kZSObfGSM4zFxKp
idfxwHz6ctXB1oni3XBaHmuyRXMZ9zERuyoNARPIJVw0ylSb93eq7dx+j02JOBtc
RtrBb1oVtOLiVdn6mIDA4LSELdmcbZdXHW4k84CqyNPWSRG4Sbhx3539r1m90vHw
EBnI5XpggxPt/U9qqpxH
=JGvn
-----END PGP SIGNATURE-----
mergetag object d3601e56cf
type commit
tag spi-for-linus
tagger Grant Likely <grant.likely@secretlab.ca> 1355962904 +0000
SPI device driver bug fixes branch for the v3.8 merge window. Most of
this is bug fixes to the core code and the sh-hspi and s3c64xx device
drivers.
There is also a patch here to add DT support to the Atmel driver. This
one should have been in the first round, but I missed it. It's a low
risk change contained within a single driver and the Atmel maintainer
has requested it.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQ0lxNAAoJEEFnBt12D9kBCTMP/0WvS/7iiilM00b5pmclGkRc
Ct3832KnZXn5dDYoSsP1AS2PRJAKbrMzvKMZu+ggjOAlwza87rKJJpt+KdU3r15b
+KElnaStCC88ohxEOwQV4+CJpF3dS1AmrGMQvo9RK8hddpR1dH+FgMdmex104X/A
Ysr0WgqQMY3iGVmnlPxU4EYlR1scnqOpaNADNbMY7ibuJ7LOYk1RYtqmX5yxdnJq
yABILFa10qV9N/cqPBu2Cuhw8Xtp/u70dsQPkALG//pyZnFyhw6rMJ6bowOC2IHM
p3R4IrVqJq1a3nL1IC1XX5/k21b0PBNxfOGqpeU4QoDK4/cIKeiPny1dtoI1UM+2
mpTU99VvitZ0ywolenKnbPdU61UwZ6Sd+ptsZM0Y/wIMiXDWSBDTZqVJRAbINqFb
JwwjtdaFDPgtIb6Yg1WuhLmhcOPfXFIRys7JmnS0VNQCpvvEIdFsR9l5jjMcmluR
W7V69z9FZj5lf7WjNstbAxbCvrw1i65OD6Nr1UBAA17bqNU6a9BCWdNJGwLozraR
k2ZomvfmIMhTP82z5by5UY39M9B5uGOHDxxXZOOvxzB/fxlYgLhiAGPfMY1FMrPH
48gDqQtdOpakL1B/gVELpBLMoKjwEx9jJa/8tJb5wy8EgNSw06BDSN69OclLgmV/
uVnSZWrp62odDLz0qjSR
=trRO
-----END PGP SIGNATURE-----
Merge tags 'dt-for-linus', 'gpio-for-linus' and 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6
Pull devicetree, gpio and spi bugfixes from Grant Likely:
"Device tree v3.8 bug fix:
- Fixes an undefined struct device build error and a missing symbol
export.
GPIO device driver bug fixes:
- gpio/mvebu-gpio: Make mvebu-gpio depend on OF_CONFIG
- gpio/ich: Add missing spinlock init
SPI device driver bug fixes:
- Most of this is bug fixes to the core code and the sh-hspi and
s3c64xx device drivers.
- There is also a patch here to add DT support to the Atmel driver.
This one should have been in the first round, but I missed it.
It's a low risk change contained within a single driver and the
Atmel maintainer has requested it."
* tag 'dt-for-linus' of git://git.secretlab.ca/git/linux-2.6:
of: define struct device in of_platform.h if !OF_DEVICE and !OF_ADDRESS
of: Fix export of of_find_matching_node_and_match()
* tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6:
gpio/mvebu-gpio: Make mvebu-gpio depend on OF_CONFIG
gpio/ich: Add missing spinlock init
* tag 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6:
spi/sh-hspi: fix return value check in hspi_probe().
spi: fix tegra SPI binding examples
spi/atmel: add DT support
of/spi: Fix SPI module loading by using proper "spi:" modalias prefixes.
spi: Change FIFO flush operation and spi channel off
spi: Keep chipselect assertion during one message
Conditional on CONFIG_GENERIC_SIGALTSTACK; architectures that do not
select it are completely unaffected
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Compat counterpart of current_user_stack_pointer(); for most of the biarch
architectures those two are identical, but e.g. arm64 and arm use different
registers for stack pointer...
Note that amd64 variants of current_user_stack_pointer/compat_user_stack_pointer
do *not* rely on pt_regs having been through FIXUP_TOP_OF_STACK.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cross-architecture equivalent of rdusp(); default is
user_stack_pointer(current_pt_regs()) - that works for almost all
platforms that have usp saved in pt_regs. The only exception from
that is ia64 - we want memory stack, not the backing store for
register one.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
All architectures have
CONFIG_GENERIC_KERNEL_THREAD
CONFIG_GENERIC_KERNEL_EXECVE
__ARCH_WANT_SYS_EXECVE
None of them have __ARCH_WANT_KERNEL_EXECVE and there are only two callers
of kernel_execve() (which is a trivial wrapper for do_execve() now) left.
Kill the conditionals and make both callers use do_execve().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
- Various cleanups especially in NAND tests
- Add support for NAND flash on BCMA bus
- DT support for sh_flctl and denali NAND drivers
- Kill obsolete/superceded drivers (fortunet, nomadik_nand)
- Fix JFFS2 locking bug in ENOMEM failure path
- New SPI flash chips, as usual
- Support writing in 'reliable mode' for DiskOnChip G4
- Debugfs support in nandsim
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iEYEABECAAYFAlDSAa4ACgkQdwG7hYl686MMcACeNYa//ghPtccb5L+IRXsqaFDL
Yi4AoLWOaOjN8qM4KUF/bfMEkwNGAePz
=DaAQ
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20121219' of git://git.infradead.org/linux-mtd
Pull MTD updates from David Woodhouse:
- Various cleanups especially in NAND tests
- Add support for NAND flash on BCMA bus
- DT support for sh_flctl and denali NAND drivers
- Kill obsolete/superceded drivers (fortunet, nomadik_nand)
- Fix JFFS2 locking bug in ENOMEM failure path
- New SPI flash chips, as usual
- Support writing in 'reliable mode' for DiskOnChip G4
- Debugfs support in nandsim
* tag 'for-linus-20121219' of git://git.infradead.org/linux-mtd: (96 commits)
mtd: nand: typo in nand_id_has_period() comments
mtd: nand/gpio: use io{read,write}*_rep accessors
mtd: block2mtd: throttle writes by calling balance_dirty_pages_ratelimited.
mtd: nand: gpmi: reset BCH earlier, too, to avoid NAND startup problems
mtd: nand/docg4: fix and improve read of factory bbt
mtd: nand/docg4: reserve bb marker area in ecclayout
mtd: nand/docg4: add support for writing in reliable mode
mtd: mxc_nand: reorder part_probes to let cmdline override other sources
mtd: mxc_nand: fix unbalanced clk_disable() in error path
mtd: nandsim: Introduce debugfs infrastructure
mtd: physmap_of: error checking to prevent a NULL pointer dereference
mtg: docg3: potential divide by zero in doc_write_oob()
mtd: bcm47xxnflash: writing support
mtd: tests/read: initialize buffer for whole next page
mtd: at91: atmel_nand: return bit flips for the PMECC read_page()
mtd: fix recovery after failed write-buffer operation in cfi_cmdset_0002.c
mtd: nand: onfi need to be probed in 8 bits mode
mtd: nand: add NAND_BUSWIDTH_AUTO to autodetect bus width
mtd: nand: print flash size during detection
mted: nand_wait_ready timeout fix
...
Centralise common code for manage_power() in usbnet
by making a generic simple implementation
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a device fails to do remote wakeup, this is no reason
to abort an open totally. This patch just continues without
runtime PM.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Device managed flow steering will be enabled only under administrator
directive provided through setting the existing module parameter
log_num_mgm_entry_size to -1 (if the device actually supports flow
steering). If flow steering isn't requested or not available, the
driver will use the value of log_num_mgm_entry_size and B0 steering.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Konrad writes:
Please git pull the following branch:
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git stable/for-jens-3.8
which has a bug-fix to the xen-blkfront and xen-blkback driver
when using the persistent mode. An issue was discovered where LVM
disks could not be read correctly and this fixes it. There
is also a change in llist.h which has been blessed by akpm.
A new driver has been added for the SPEAr platform and the TWL4030/6030
driver has been replaced by two drivers that control the regular PWMs
and the PWM driven LEDs provided by the chips.
The vt8500, tiecap, tiehrpwm, i.MX, LPC32xx and Samsung drivers have all
been improved and the device tree bindings now support the PWM signal
polarity.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJQ0XF+AAoJEN0jrNd/PrOhk5IP/RxjjfVM8z7i0xc6ykNRyv/3
y8jRh1miwPXeamLdW07vF2NILBtDBmZ8TUbfAMt1esf25UST79Rgol/ia+QlBb3q
w/pCDVGwlOW+2qUd34Nlb+CyKXRROIbPcFy29RZn+Z29qdkWn4LVS0nZ0UJPSPel
80P6qxkrFGG8eKsYKB7InA0g4Ds0neWRJAoYsVp5jzyOPFpUILPdaptX+iSEk1v3
VvkIx8eTDPZO9aqn8qL6bcE0g0AneF+dJ4qzLiswlsPMxsFIoVysw6n2JuEyy+FD
x0Ml86Zl4SiNzZa4Pwa1250MwFT3cvnjWbAdLb2CGJVMV/uGU6nuyfXV0Aa1XtjQ
C0k8LNshgiID8/m1/H3+/aUy9Cx/hj1KM4jchrwkCphlBiAEEryKjTPJXDwfjuBI
s5NP4rUwfNVSQT66RaNVZ7atOKyeVu+hwAKO0h6PHOsD1GwhsT+b51/YmWXmb2E5
OgLLOOHVFORfxCrsXRhWcHydMzfplOtfZ4smr4WG0hKUVn2Zp1DK1zDE2rCBB4X1
ZMas4OO9uDRY9IDXZUZUtXcDPDppI6Zx3YeE1/MWzmjWqzyEYFf5OXBbakPr40Rq
lKEwYPNf5yiqIURfFZiGDk61mrwA0Vi3i8vYfTFq1zyX9u8KbXeY8g7AhcyC1qsM
YhfCmJZ0njopu8oENMgn
=m+7I
-----END PGP SIGNATURE-----
Merge tag 'for-3.8-rc1' of git://gitorious.org/linux-pwm/linux-pwm
Pull pwm changes from Thierry Reding:
"A new driver has been added for the SPEAr platform and the
TWL4030/6030 driver has been replaced by two drivers that control the
regular PWMs and the PWM driven LEDs provided by the chips.
The vt8500, tiecap, tiehrpwm, i.MX, LPC32xx and Samsung drivers have
all been improved and the device tree bindings now support the PWM
signal polarity."
Fix up trivial conflicts due to __devinit/exit removal.
* tag 'for-3.8-rc1' of git://gitorious.org/linux-pwm/linux-pwm: (21 commits)
pwm: samsung: add missing s3c->pwm_id assignment
pwm: lpc32xx: Set the chip base for dynamic allocation
pwm: lpc32xx: Properly disable the clock on device removal
pwm: lpc32xx: Fix the PWM polarity
pwm: i.MX: eliminate build warning
pwm: Export of_pwm_xlate_with_flags()
pwm: Remove pwm-twl6030 driver
pwm: New driver to support PWM driven LEDs on TWL4030/6030 series of PMICs
pwm: New driver to support PWMs on TWL4030/6030 series of PMICs
pwm: pwm-tiehrpwm: pinctrl support
pwm: tiehrpwm: Add device-tree binding
pwm: pwm-tiehrpwm: Adding TBCLK gating support.
pwm: pwm-tiecap: pinctrl support
pwm: tiecap: Add device-tree binding
pwm: Add TI PWM subsystem driver
pwm: Device tree support for PWM polarity
pwm: vt8500: Ensure PWM clock is enabled during pwm_config
pwm: vt8500: Fix build error
pwm: spear: Staticize spear_pwm_config()
pwm: Add SPEAr PWM chip driver support
...
Fixes the following warning:
include/linux/of_platform.h:106:13: warning: 'struct device' declared
inside parameter list [enabled by default]
include/linux/of_platform.h:106:13: warning: its scope is only this
definition or declaration, which is probably not what you want [enabled
by default]
Signed-off-by: Jonas Gorski <jogo@openwrt.org>
Signed-off-by: Rob Herring <robherring2@gmail.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
to verify the source of the module (ChromeOS) and/or use standard IMA on it
or other security hooks.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQ0VKlAAoJENkgDmzRrbjxjuEQALVHpD1cSmryOzVwkNn7rVGP
PV3KVbUs+qzUCm2c3AafIIlSBm2LOUl+cR3uNC7di8aHarRF3VHkK2OQ4Fx97ECd
KKBqAyY3R0q1mAKujb/MWwiK0YgosEDIOzGGn2yQhNFsxKqnMB02P4j82IO7+g+w
Cc3XuDyWHoH2I+ySgz0Q8NHAqufD/DMZUKud7jw2Lsv6PuICJ1Oqgl/Gd/muxort
4a5tV3tjhRGywHS/8b2fbDUXkybC5NKK0FN+gyoaROmJ/THeHEQDGXZT9bc2vmVx
HvRy/5k8dzQ6LAJ2mLnPvy0pmv0u7NYMvjxTxxUlUkFMkYuVticikQfwSYDbDPt4
mbsLxchpgi8z4x8HltEERffCX5tldo/5hz1uemqhqIsMRIrRFnlHkSIgkGjVHf2u
LXQBLT8uTm6C0VyNQPrI/hUZzIax7WtKbPSoK9lmExNbKqloEFh/mVXvfQxei2kp
wnUZcnmPIqSvw7b4CWu7HibMYu2VvGBgm3YIfJRi4AQme1mzFYLpZoxF5Pj+Ykbt
T//Hb1EsNQTTFCg7MZhnJSAw/EVUvNDUoullORClyqw6+xxjVKqWpPJgYDRfWOlJ
Xa+s7DNrL+Oo1WWR8l5ruoQszbR8szIyeyPKKxRUcQj2zsqghoWuzKAx2saSEw3W
pNkoJU+dGC7kG/yVAS8N
=uoJj
-----END PGP SIGNATURE-----
Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module update from Rusty Russell:
"Nothing all that exciting; a new module-from-fd syscall for those who
want to verify the source of the module (ChromeOS) and/or use standard
IMA on it or other security hooks."
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
MODSIGN: Fix kbuild output when using default extra_certificates
MODSIGN: Avoid using .incbin in C source
modules: don't hand 0 to vmalloc.
module: Remove a extra null character at the top of module->strtab.
ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants
ASN.1: Define indefinite length marker constant
moduleparam: use __UNIQUE_ID()
__UNIQUE_ID()
MODSIGN: Add modules_sign make target
powerpc: add finit_module syscall.
ima: support new kernel module syscall
add finit_module syscall to asm-generic
ARM: add finit_module syscall to ARM
security: introduce kernel_module_from_file hook
module: add flags arg to sys_finit_module()
module: add syscall to load module from fd
to opt in to using GCC's __builtin_bswapXX() intrinsics for byteswapping,
and if we merge this now then the architecture maintainers can enable it
for their arch during the next cycle without dependency issues.
It's worth making it a par-arch opt-in, because although in *theory* the
compiler should never do worse than hand-coded assembler (and of course
it also ought to do a lot better on platforms like Atom and PowerPC which
have load-and-swap or store-and-swap instructions), that isn't always the
case. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46453 for example.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iEYEABECAAYFAlDRvNsACgkQdwG7hYl686O7KACeKQMiuZMLB9ctF5u0Iql+33PF
+WAAnisvZ8HCUjG5E8DF6HWy45r4BjUp
=eeUs
-----END PGP SIGNATURE-----
Merge tag 'byteswap-for-linus-20121219' of git://git.infradead.org/users/dwmw2/byteswap
Pull preparatory gcc intrisics bswap patch from David Woodhouse:
"This single patch is effectively a no-op for now. It enables
architectures to opt in to using GCC's __builtin_bswapXX() intrinsics
for byteswapping, and if we merge this now then the architecture
maintainers can enable it for their arch during the next cycle without
dependency issues.
It's worth making it a par-arch opt-in, because although in *theory*
the compiler should never do worse than hand-coded assembler (and of
course it also ought to do a lot better on platforms like Atom and
PowerPC which have load-and-swap or store-and-swap instructions), that
isn't always the case. See
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46453
for example."
* tag 'byteswap-for-linus-20121219' of git://git.infradead.org/users/dwmw2/byteswap:
byteorder: allow arch to opt to use GCC intrinsics for byteswapping
Commit 8dd2cb7e88 ("block: discard granularity might not be power of
2") changed a couple of 'binary and' operations into modulus operations.
Which turned the harmless case of a zero discard_granularity into a
possible divide-by-zero.
The code also had a much more subtle bug: it was doing the modulus of a
value in bytes using 'sector_t'. That was always conceptually wrong,
but didn't actually matter back when the code assumed a power-of-two
granularity: we only looked at the low bits anyway.
But with potentially arbitrary sector numbers, using a 'sector_t' to
express bytes is very very wrong: depending on configuration it limits
the starting offset of the device to just 32 bits, and any overflow
would result in a wrong value if the modulus wasn't a power-of-two.
So re-write the code to not only protect against the divide-by-zero, but
to do the starting sector arithmetic in sectors, and using the proper
types.
[ For any mathematicians out there: it also looks monumentally stupid to
do the 'modulo granularity' operation *twice*, never mind having a "+
granularity" in the second modulus op.
But that's the easiest way to avoid negative values or overflow, and
it is how the original code was done. ]
Reported-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Doug Anderson <dianders@chromium.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Shaohua Li <shli@fusionio.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull i2c-embedded changes from Wolfram Sang:
- CBUS driver (an I2C variant)
- continued rework of the omap driver
- s3c2410 gets lots of fixes and gains pinctrl support
- at91 gains DMA support
- the GPIO muxer gains devicetree probing
- typical fixes and additions all over
* 'i2c-embedded/for-next' of git://git.pengutronix.de/git/wsa/linux: (45 commits)
i2c: omap: Remove the OMAP_I2C_FLAG_RESET_REGS_POSTIDLE flag
i2c: at91: add dma support
i2c: at91: change struct members indentation
i2c: at91: fix compilation warning
i2c: mxs: Do not disable the I2C SMBus quick mode
i2c: mxs: Handle i2c DMA failure properly
i2c: s3c2410: Remove recently introduced performance overheads
i2c: ocores: Move grlib set/get functions into #ifdef CONFIG_OF block
i2c: s3c2410: Add fix for i2c suspend/resume
i2c: s3c2410: Fix code to free gpios
i2c: i2c-cbus-gpio: introduce driver
i2c: ocores: Add support for the GRLIB port of the controller and use function pointers for getreg and setreg functions
i2c: ocores: Add irq support for sparc
i2c: omap: Move the remove constraint
ARM: dts: cfa10049: Add the i2c muxer buses to the CFA-10049
i2c: s3c2410: do not special case HDMIPHY stuck bus detection
i2c: s3c2410: use exponential back off while polling for bus idle
i2c: s3c2410: do not generate STOP for QUIRK_HDMIPHY
i2c: s3c2410: grab adapter lock while changing i2c clock
i2c: s3c2410: Add support for pinctrl
...
Merge patches from Andrew Morton:
"Most of the rest of MM, plus a few dribs and drabs.
I still have quite a few irritating patches left around: ones with
dubious testing results, lack of review, ones which should have gone
via maintainer trees but the maintainers are slack, etc.
I need to be more activist in getting these things wrapped up outside
the merge window, but they're such a PITA."
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (48 commits)
mm/vmscan.c: avoid possible deadlock caused by too_many_isolated()
vmscan: comment too_many_isolated()
mm/kmemleak.c: remove obsolete simple_strtoul
mm/memory_hotplug.c: improve comments
mm/hugetlb: create hugetlb cgroup file in hugetlb_init
mm/mprotect.c: coding-style cleanups
Documentation: ABI: /sys/devices/system/node/
slub: drop mutex before deleting sysfs entry
memcg: add comments clarifying aspects of cache attribute propagation
kmem: add slab-specific documentation about the kmem controller
slub: slub-specific propagation changes
slab: propagate tunable values
memcg: aggregate memcg cache values in slabinfo
memcg/sl[au]b: shrink dead caches
memcg/sl[au]b: track all the memcg children of a kmem_cache
memcg: destroy memcg caches
sl[au]b: allocate objects from memcg cache
sl[au]b: always get the cache from its page in kmem_cache_free()
memcg: skip memcg kmem allocations in specified code regions
memcg: infrastructure to match an allocation to the right cache
...
Build kernel with CONFIG_HUGETLBFS=y,CONFIG_HUGETLB_PAGE=y and
CONFIG_CGROUP_HUGETLB=y, then specify hugepagesz=xx boot option, system
will fail to boot.
This failure is caused by following code path:
setup_hugepagesz
hugetlb_add_hstate
hugetlb_cgroup_file_init
cgroup_add_cftypes
kzalloc <--slab is *not available* yet
For this path, slab is not available yet, so memory allocated will be
failed, and cause WARN_ON() in hugetlb_cgroup_file_init().
So I move hugetlb_cgroup_file_init() into hugetlb_init().
[akpm@linux-foundation.org: tweak coding-style, remove pointless __init on inlined function]
[akpm@linux-foundation.org: fix warning]
Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch clarifies two aspects of cache attribute propagation.
First, the expected context for the for_each_memcg_cache macro in
memcontrol.h. The usages already in the codebase are safe. In mm/slub.c,
it is trivially safe because the lock is acquired right before the loop.
In mm/slab.c, it is less so: the lock is acquired by an outer function a
few steps back in the stack, so a VM_BUG_ON() is added to make sure it is
indeed safe.
A comment is also added to detail why we are returning the value of the
parent cache and ignoring the children's when we propagate the attributes.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SLUB allows us to tune a particular cache behavior with sysfs-based
tunables. When creating a new memcg cache copy, we'd like to preserve any
tunables the parent cache already had.
This can be done by tapping into the store attribute function provided by
the allocator. We of course don't need to mess with read-only fields.
Since the attributes can have multiple types and are stored internally by
sysfs, the best strategy is to issue a ->show() in the root cache, and
then ->store() in the memcg cache.
The drawback of that, is that sysfs can allocate up to a page in buffering
for show(), that we are likely not to need, but also can't guarantee. To
avoid always allocating a page for that, we can update the caches at store
time with the maximum attribute size ever stored to the root cache. We
will then get a buffer big enough to hold it. The corolary to this, is
that if no stores happened, nothing will be propagated.
It can also happen that a root cache has its tunables updated during
normal system operation. In this case, we will propagate the change to
all caches that are already active.
[akpm@linux-foundation.org: tweak code to avoid __maybe_unused]
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SLAB allows us to tune a particular cache behavior with tunables. When
creating a new memcg cache copy, we'd like to preserve any tunables the
parent cache already had.
This could be done by an explicit call to do_tune_cpucache() after the
cache is created. But this is not very convenient now that the caches are
created from common code, since this function is SLAB-specific.
Another method of doing that is taking advantage of the fact that
do_tune_cpucache() is always called from enable_cpucache(), which is
called at cache initialization. We can just preset the values, and then
things work as expected.
It can also happen that a root cache has its tunables updated during
normal system operation. In this case, we will propagate the change to
all caches that are already active.
This change will require us to move the assignment of root_cache in
memcg_params a bit earlier. We need this to be already set - which
memcg_kmem_register_cache will do - when we reach __kmem_cache_create()
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When we create caches in memcgs, we need to display their usage
information somewhere. We'll adopt a scheme similar to /proc/meminfo,
with aggregate totals shown in the global file, and per-group information
stored in the group itself.
For the time being, only reads are allowed in the per-group cache.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This enables us to remove all the children of a kmem_cache being
destroyed, if for example the kernel module it's being used in gets
unloaded. Otherwise, the children will still point to the destroyed
parent.
Signed-off-by: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement destruction of memcg caches. Right now, only caches where our
reference counter is the last remaining are deleted. If there are any
other reference counters around, we just leave the caches lying around
until they go away.
When that happens, a destruction function is called from the cache code.
Caches are only destroyed in process context, so we queue them up for
later processing in the general case.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We are able to match a cache allocation to a particular memcg. If the
task doesn't change groups during the allocation itself - a rare event,
this will give us a good picture about who is the first group to touch a
cache page.
This patch uses the now available infrastructure by calling
memcg_kmem_get_cache() before all the cache allocations.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
struct page already has this information. If we start chaining caches,
this information will always be more trustworthy than whatever is passed
into the function.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Create a mechanism that skip memcg allocations during certain pieces of
our core code. It basically works in the same way as
preempt_disable()/preempt_enable(): By marking a region under which all
allocations will be accounted to the root memcg.
We need this to prevent races in early cache creation, when we
allocate data using caches that are not necessarily created already.
Signed-off-by: Glauber Costa <glommer@parallels.com>
yCc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The page allocator is able to bind a page to a memcg when it is
allocated. But for the caches, we'd like to have as many objects as
possible in a page belonging to the same cache.
This is done in this patch by calling memcg_kmem_get_cache in the
beginning of every allocation function. This function is patched out by
static branches when kernel memory controller is not being used.
It assumes that the task allocating, which determines the memcg in the
page allocator, belongs to the same cgroup throughout the whole process.
Misaccounting can happen if the task calls memcg_kmem_get_cache() while
belonging to a cgroup, and later on changes. This is considered
acceptable, and should only happen upon task migration.
Before the cache is created by the memcg core, there is also a possible
imbalance: the task belongs to a memcg, but the cache being allocated from
is the global cache, since the child cache is not yet guaranteed to be
ready. This case is also fine, since in this case the GFP_KMEMCG will not
be passed and the page allocator will not attempt any cgroup accounting.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Every cache that is considered a root cache (basically the "original"
caches, tied to the root memcg/no-memcg) will have an array that should be
large enough to store a cache pointer per each memcg in the system.
Theoreticaly, this is as high as 1 << sizeof(css_id), which is currently
in the 64k pointers range. Most of the time, we won't be using that much.
What goes in this patch, is a simple scheme to dynamically allocate such
an array, in order to minimize memory usage for memcg caches. Because we
would also like to avoid allocations all the time, at least for now, the
array will only grow. It will tend to be big enough to hold the maximum
number of kmem-limited memcgs ever achieved.
We'll allocate it to be a minimum of 64 kmem-limited memcgs. When we have
more than that, we'll start doubling the size of this array every time the
limit is reached.
Because we are only considering kmem limited memcgs, a natural point for
this to happen is when we write to the limit. At that point, we already
have set_limit_mutex held, so that will become our natural synchronization
mechanism.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow a memcg parameter to be passed during cache creation. When the slub
allocator is being used, it will only merge caches that belong to the same
memcg. We'll do this by scanning the global list, and then translating
the cache to a memcg-specific cache
Default function is created as a wrapper, passing NULL to the memcg
version. We only merge caches that belong to the same memcg.
A helper is provided, memcg_css_id: because slub needs a unique cache name
for sysfs. Since this is visible, but not the canonical location for slab
data, the cache name is not used, the css_id should suffice.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For the kmem slab controller, we need to record some extra information in
the kmem_cache structure.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Suleiman Souhlal <suleiman@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Because those architectures will draw their stacks directly from the page
allocator, rather than the slab cache, we can directly pass __GFP_KMEMCG
flag, and issue the corresponding free_pages.
This code path is taken when the architecture doesn't define
CONFIG_ARCH_THREAD_INFO_ALLOCATOR (only ia64 seems to), and has
THREAD_SIZE >= PAGE_SIZE. Luckily, most - if not all - of the remaining
architectures fall in this category.
This will guarantee that every stack page is accounted to the memcg the
process currently lives on, and will have the allocations to fail if they
go over limit.
For the time being, I am defining a new variant of THREADINFO_GFP, not to
mess with the other path. Once the slab is also tracked by memcg, we can
get rid of that flag.
Tested to successfully protect against :(){ :|:& };:
Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Frederic Weisbecker <fweisbec@redhat.com>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We can use static branches to patch the code in or out when not used.
Because the _ACTIVE bit on kmem_accounted is only set after the increment
is done, we guarantee that the root memcg will always be selected for kmem
charges until all call sites are patched (see memcg_kmem_enabled). This
guarantees that no mischarges are applied.
Static branch decrement happens when the last reference count from the
kmem accounting in memcg dies. This will only happen when the charges
drop down to 0.
When that happens, we need to disable the static branch only on those
memcgs that enabled it. To achieve this, we would be forced to complicate
the code by keeping track of which memcgs were the ones that actually
enabled limits, and which ones got it from its parents.
It is a lot simpler just to do static_key_slow_inc() on every child
that is accounted.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It is useful to know how many charges are still left after a call to
res_counter_uncharge. While it is possible to issue a res_counter_read
after uncharge, this can be racy.
If we need, for instance, to take some action when the counters drop down
to 0, only one of the callers should see it. This is the same semantics
as the atomic variables in the kernel.
Since the current return value is void, we don't need to worry about
anything breaking due to this change: nobody relied on that, and only
users appearing from now on will be checking this value.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When a process tries to allocate a page with the __GFP_KMEMCG flag, the
page allocator will call the corresponding memcg functions to validate
the allocation. Tasks in the root memcg can always proceed.
To avoid adding markers to the page - and a kmem flag that would
necessarily follow, as much as doing page_cgroup lookups for no reason,
whoever is marking its allocations with __GFP_KMEMCG flag is responsible
for telling the page allocator that this is such an allocation at
free_pages() time. This is done by the invocation of
__free_accounted_pages() and free_accounted_pages().
Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce infrastructure for tracking kernel memory pages to a given
memcg. This will happen whenever the caller includes the flag
__GFP_KMEMCG flag, and the task belong to a memcg other than the root.
In memcontrol.h those functions are wrapped in inline acessors. The idea
is to later on, patch those with static branches, so we don't incur any
overhead when no mem cgroups with limited kmem are being used.
Users of this functionality shall interact with the memcg core code
through the following functions:
memcg_kmem_newpage_charge: will return true if the group can handle the
allocation. At this point, struct page is not
yet allocated.
memcg_kmem_commit_charge: will either revert the charge, if struct page
allocation failed, or embed memcg information
into page_cgroup.
memcg_kmem_uncharge_page: called at free time, will revert the charge.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This flag is used to indicate to the callees that this allocation is a
kernel allocation in process context, and should be accounted to current's
memcg.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: JoonSoo Kim <js1304@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull second round of input updates from Dmitry Torokhov:
"As usual, there are a couple of new drivers, input core now supports
managed input devices (devres), a slew of drivers now have device tree
support and a bunch of fixes and cleanups."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (71 commits)
Input: walkera0701 - fix crash on startup
Input: matrix-keymap - provide a proper module license
Input: gpio_keys_polled - switch to using gpio_request_one()
Input: gpio_keys - switch to using gpio_request_one()
Input: wacom - fix touch support for Bamboo Fun CTH-461
Input: xpad - add a few new VID/PID combinations
Input: xpad - minor formatting fixes
Input: gpio-keys-polled - honor 'autorepeat' setting in platform data
Input: tca8418-keypad - switch to using managed resources
Input: tca8418_keypad - increase severity of failures in probe()
Input: tca8418_keypad - move device ID tables closer to where they are used
Input: tca8418_keypad - use dev_get_platdata() to retrieve platform data
Input: tca8418_keypad - use a temporary variable for parent device
Input: tca8418_keypad - add support for shared interrupt
Input: tca8418_keypad - add support for device tree bindings
Input: remove Compaq iPAQ H3600 (Bitsy) touchscreen driver
Input: bu21013_ts - add support for Device Tree booting
Input: bu21013_ts - move GPIO init and exit functions into the driver
Input: bu21013_ts - request regulator that actually exists
ARM: ux500: Strip out duplicate touch screen platform information
...
Pull SLAB changes from Pekka Enberg:
"This contains preparational work from Christoph Lameter and Glauber
Costa for SLAB memcg and cleanups and improvements from Ezequiel
Garcia and Joonsoo Kim.
Please note that the SLOB cleanup commit from Arnd Bergmann already
appears in your tree but I had also merged it myself which is why it
shows up in the shortlog."
* 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
mm/sl[aou]b: Common alignment code
slab: Use the new create_boot_cache function to simplify bootstrap
slub: Use statically allocated kmem_cache boot structure for bootstrap
mm, sl[au]b: create common functions for boot slab creation
slab: Simplify bootstrap
slub: Use correct cpu_slab on dead cpu
mm: fix slab.c kernel-doc warnings
mm/slob: use min_t() to compare ARCH_SLAB_MINALIGN
slab: Ignore internal flags in cache creation
mm/slob: Use free_page instead of put_page for page-size kmalloc allocations
mm/sl[aou]b: Move common kmem_cache_size() to slab.h
mm/slob: Use object_size field in kmem_cache_size()
mm/slob: Drop usage of page->private for storing page-sized allocations
slub: Commonize slab_cache field in struct page
sl[au]b: Process slabinfo_show in common code
mm/sl[au]b: Move print_slabinfo_header to slab_common.c
mm/sl[au]b: Move slabinfo processing to slab_common.c
slub: remove one code path and reduce lock contention in __slab_free()
Pull powerpc update from Benjamin Herrenschmidt:
"The main highlight is probably some base POWER8 support. There's more
to come such as transactional memory support but that will wait for
the next one.
Overall it's pretty quiet, or rather I've been pretty poor at picking
things up from patchwork and reviewing them this time around and Kumar
no better on the FSL side it seems..."
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (73 commits)
powerpc+of: Rename and fix OF reconfig notifier error inject module
powerpc: mpc5200: Add a3m071 board support
powerpc/512x: don't compile any platform DIU code if the DIU is not enabled
powerpc/mpc52xx: use module_platform_driver macro
powerpc+of: Export of_reconfig_notifier_[register,unregister]
powerpc/dma/raidengine: add raidengine device
powerpc/iommu/fsl: Add PAMU bypass enable register to ccsr_guts struct
powerpc/mpc85xx: Change spin table to cached memory
powerpc/fsl-pci: Add PCI controller ATMU PM support
powerpc/86xx: fsl_pcibios_fixup_bus requires CONFIG_PCI
drivers/virt: the Freescale hypervisor driver doesn't need to check MSR[GS]
powerpc/85xx: p1022ds: Use NULL instead of 0 for pointers
powerpc: Disable relocation on exceptions when kexecing
powerpc: Enable relocation on during exceptions at boot
powerpc: Move get_longbusy_msecs into hvcall.h and remove duplicate function
powerpc: Add wrappers to enable/disable relocation on exceptions
powerpc: Add set_mode hcall
powerpc: Setup relocation on exceptions for bare metal systems
powerpc: Move initial mfspr LPCR out of __init_LPCR
powerpc: Add relocation on exception vector handlers
...
Features include:
- Full audit of BUG_ON asserts in the NFS, SUNRPC and lockd client code
Remove altogether where possible, and replace with WARN_ON_ONCE and
appropriate error returns where not.
- NFSv4.1 client adds session dynamic slot table management. There is
matching server side code that has been submitted to Bruce for
consideration. Together, this code allows the server to dynamically
manage the amount of memory it allocates to the duplicate request
cache for each client. It will constantly resize those caches to
reserve more memory for clients that are hot while shrinking caches
for those that are quiescent.
In addition, there are assorted bugfixes for the generic NFS write code,
fixes to deal with the drop_nlink() warnings, and yet another fix for
NFSv4 getacl.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJQz8VNAAoJEGcL54qWCgDy7iYQAKbr7AAZOcZPoJigzakZ7nMi
UKYulGbFais2Llwzw1e+U5RzmorTSbvl7/m8eS7pDf3auYw/t4xtXjKSGZUNxaE1
q2hNKgVwodMbScYdkZXvKKNckS93oPDttrmEyzjKanqey+1E3HSklvOvikN0ihte
B/G1OtA7Qpcr92bPrLK+PjDqarCBUI4g42dYbZOBrZnXKTRtzUqsuKPu7WjpPiof
SHE5b1Emt7oUxgcijWGcvYCQ8voZdeSCnSksH3DgvORlutwdhUD3Yg8KyEfFZdyc
6C59ozXRLiHkV3c+jMhJzDkQXR9bYHrnK3tlq4G8v1NdJxRktQliZeqecRvip/Wz
rAxfE6fnPDEvKsCpZb3+5yTAt+aZwzEhRg1fFC9qfGOp+oRa+CWw5kJCyIFHwJu6
4LOlubQAf6rnIsja1L8D0FdeqHUa1+wy61On5kgVYS5JGtoBsQHpa1zTwdOxPmsR
2XTMYGNCEabvpKpO9+5xQbUzkFExPTesw47ygXiUuDT/snaarpV3/f05SSCaWZkX
R8QsGEOXTIh8/S+UxARGpc7H6xi1PdBM5nBziHVzjEdHgZRF4wGFaJe2CirMjSJO
Df5GEd5Z/8VCGWs+1w7HD5EaQ2n0wbt5daCE80Y2jRBr7NMYnY+ciF8/GktLpHsn
Zq1bXGOdr3UZ92LXuzL9
=G3N9
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Features include:
- Full audit of BUG_ON asserts in the NFS, SUNRPC and lockd client
code. Remove altogether where possible, and replace with
WARN_ON_ONCE and appropriate error returns where not.
- NFSv4.1 client adds session dynamic slot table management. There
is matching server side code that has been submitted to Bruce for
consideration.
Together, this code allows the server to dynamically manage the
amount of memory it allocates to the duplicate request cache for
each client. It will constantly resize those caches to reserve
more memory for clients that are hot while shrinking caches for
those that are quiescent.
In addition, there are assorted bugfixes for the generic NFS write
code, fixes to deal with the drop_nlink() warnings, and yet another
fix for NFSv4 getacl."
* tag 'nfs-for-3.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (106 commits)
SUNRPC: continue run over clients list on PipeFS event instead of break
NFS: Don't use SetPageError in the NFS writeback code
SUNRPC: variable 'svsk' is unused in function bc_send_request
SUNRPC: Handle ECONNREFUSED in xs_local_setup_socket
NFSv4.1: Deal effectively with interrupted RPC calls.
NFSv4.1: Move the RPC timestamp out of the slot.
NFSv4.1: Try to deal with NFS4ERR_SEQ_MISORDERED.
NFS: nfs_lookup_revalidate should not trust an inode with i_nlink == 0
NFS: Fix calls to drop_nlink()
NFS: Ensure that we always drop inodes that have been marked as stale
nfs: Remove unused list nfs4_clientid_list
nfs: Remove duplicate function declaration in internal.h
NFS: avoid NULL dereference in nfs_destroy_server
SUNRPC handle EKEYEXPIRED in call_refreshresult
SUNRPC set gss gc_expiry to full lifetime
nfs: fix page dirtying in NFS DIO read codepath
nfs: don't zero out the rest of the page if we hit the EOF on a DIO READ
NFSv4.1: Be conservative about the client highest slotid
NFSv4.1: Handle NFS4ERR_BADSLOT errors correctly
nfs: don't extend writes to cover entire page if pagecache is invalid
...
Mostly just little fixes. Probably biggest part is
AVX accelerated RAID6 calculations.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIVAwUAUM/w2Dnsnt1WYoG5AQKXlg/9F5juv4CjRkRRFLqZgOPBLmn/s/2Vspgh
2Kv8Jcyixd8jUQNbobZv0ahlJH/iSU61kpOE8QjLbKi5Y42vAbM0ZU2aHJ6nqGZy
HiTI8K+7kTvCK3ZXLcUQ+4oPPBNTcoTZbLWaEOmIqB1ruLddoIR7M9fG3PspVeG0
jijnXR8IfL6mr4YDXnJkEhFrneTysVik05RkKYZKyM/9r3stAoMJ9o0/EFy3OFxb
lO6mLEtvjVArXcnuf1RMCw2YKgki9Y4r73HCplgQsVFvcxcpsya4gFF+lRR5j7cO
/eMYbSQ89iWEYKh1dJ9u1nofc8fX5ia71QQyO1fkO4GXRHXPVIyBgKSbe7SaL6iG
JUMm7idUV2rZGeq3ln3k8Yor4QqHvN1n7pRKKUF+ZdsPoQ1B/TABu+qpsAdo5ZhP
fxDsULsHrzEaxgetd4V8F2Uptca9ni43sMI8mwsvVlA0p6SOzMIyoJLC9xAZpx11
b3H3+7Oje/fasmszBoq5B9uAlSt9XXVN4DDn2q6cX+S96JSX6jcsN1c6cJBO+ZxB
OU6a6P5mnU6HuxU02rspe7G8BeU+ybaonErOW+GdyC4r7M/cImC0dSp0NGHK2211
oqu0xBx/Q/ddTFwKQqa4HzR2ws09+LhKbjdqYIhCEKttIbLIAjf73ARZ19XPSRRX
pDR/ey2CB6E=
=uK52
-----END PGP SIGNATURE-----
Merge tag 'md-3.8' of git://neil.brown.name/md
Pull md update from Neil Brown:
"Mostly just little fixes. Probably biggest part is AVX accelerated
RAID6 calculations."
* tag 'md-3.8' of git://neil.brown.name/md:
md/raid5: add blktrace calls
md/raid5: use async_tx_quiesce() instead of open-coding it.
md: Use ->curr_resync as last completed request when cleanly aborting resync.
lib/raid6: build proper files on corresponding arch
lib/raid6: Add AVX2 optimized gen_syndrome functions
lib/raid6: Add AVX2 optimized recovery functions
md: Update checkpoint of resync/recovery based on time.
md:Add place to update ->recovery_cp.
md.c: re-indent various 'switch' statements.
md: close race between removing and adding a device.
md: removed unused variable in calc_sb_1_csm.
Fix up a trivial merge conflict with commit baaf1dd ("mm/slob: use
min_t() to compare ARCH_SLAB_MINALIGN") that did not go through the slab
tree.
Conflicts:
mm/slob.c
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Merge misc patches from Andrew Morton:
"Incoming:
- lots of misc stuff
- backlight tree updates
- lib/ updates
- Oleg's percpu-rwsem changes
- checkpatch
- rtc
- aoe
- more checkpoint/restart support
I still have a pile of MM stuff pending - Pekka should be merging
later today after which that is good to go. A number of other things
are twiddling thumbs awaiting maintainer merges."
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (180 commits)
scatterlist: don't BUG when we can trivially return a proper error.
docs: update documentation about /proc/<pid>/fdinfo/<fd> fanotify output
fs, fanotify: add @mflags field to fanotify output
docs: add documentation about /proc/<pid>/fdinfo/<fd> output
fs, notify: add procfs fdinfo helper
fs, exportfs: add exportfs_encode_inode_fh() helper
fs, exportfs: escape nil dereference if no s_export_op present
fs, epoll: add procfs fdinfo helper
fs, eventfd: add procfs fdinfo helper
procfs: add ability to plug in auxiliary fdinfo providers
tools/testing/selftests/kcmp/kcmp_test.c: print reason for failure in kcmp_test
breakpoint selftests: print failure status instead of cause make error
kcmp selftests: print fail status instead of cause make error
kcmp selftests: make run_tests fix
mem-hotplug selftests: print failure status instead of cause make error
cpu-hotplug selftests: print failure status instead of cause make error
mqueue selftests: print failure status instead of cause make error
vm selftests: print failure status instead of cause make error
ubifs: use prandom_bytes
mtd: nandsim: use prandom_bytes
...
Add drv_to_virtio wrapper to get virtio_driver from device_driver.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Use dev_to_virtio wrapper in virtio to make code clearly.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They're generic concepts, so hoist them. This also avoids accessor
functions (though kept around for merge with DaveM's net tree).
This goes even further than Jason Wang's 17bb6d4088 patch
("virtio-ring: move queue_index to vring_virtqueue") which moved the
queue_index from the specific transport.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Implement a safe version of llist_for_each_entry, and use it in
blkif_free. Previously grants where freed while iterating the list,
which lead to dereferences when trying to fetch the next item.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
[v2: Move the llist_for_each_entry_safe in llist.h]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
conflict with the fb code, few timer warnings, and longer
term regressions for tfp410 and omap h4 ethernet. Also
included is a GPIO mode fix for the legacy mux code.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJQz2hxAAoJEBvUPslcq6VzIHMP/i6QowlZm742prpsyeOemU+a
NZa+bfA58Pps+96zGHuCOfpPsJ8MqvYdwtOyH9NiY648A44LrKCsaqGdEzuXo04T
wlgNsnMWaOBNzk5pQuUO+IlCorJpv0GXwSW4DO61UVt1A3AdLo1RSl2mq32xM6Qt
0ZHEhXbbL14QidwmKRo1zubn+nEQKiWRD2+Bj+ccDq3A2lsn6itd+YNnHB0QNwUQ
+nibTleSV1QecCl0UDIwA4cTyZkeDafoAHaUH1jhIAQ30lZx1r7XbmEk6YJ51v+f
f02i8tUZoX97Pr82FJnA/T28dkFXWlsliJEQ6pUYW7Y2vMjyLXgBIDQytyS5sD1H
Eifk6WVuBGTd/mYhcNr7Q/MueIl/aT+c5tzZL3lVnxHZj7CUlt5li2SJQZZBlHN0
Ix1rtu1YVPdrAqZxn4GV4jqc9e+ULYvP7GZhrJuiBOgHsPzHynOYWZNLN5IVBrbZ
2zKvk1vHGMKbfEm/oFvjudBm4VRpcgp+W1Xp84YTmKPf/ZlhvjpbZK5whb15LsDM
N9ufVvNxq6JNo3+8AtQIsch9t0CONXwyRB48lSYte55QgMpHUOMKQZv5NcBsj4RK
AlvrqZ0PIJrFSRPwKPgWpC5VLP7hwdCMWAAB4nPAxTgjwrT35qoxZgLKiVNQCYrJ
9NfIbnESzxXKHUFc0NFJ
=8+P4
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v3.8/fixes-for-merge-window-v4-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
From Tony Lindgren:
These patches fixes a build error caused by a merge
conflict with the fb code, few timer warnings, and longer
term regressions for tfp410 and omap h4 ethernet. Also
included is a GPIO mode fix for the legacy mux code.
* tag 'omap-for-v3.8/fixes-for-merge-window-v4-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2+: common: remove use of vram
ARM: OMAP: Move plat/omap-serial.h to include/linux/platform_data/serial-omap.h
ARM: dts: Add build target for omap4-panda-a4
ARM: dts: OMAP2420: Correct H4 board memory size
mfd: omap-usb-host: get rid of cpu_is_omap..() macros
ARM: OMAP: Remove debug-devices.c
ARM: OMAP2420: Fix ethernet support for OMAP2420 H4
OMAP2+: mux: Fixed gpio mux mode analysis
OMAP: board-files: fix i2c_bus for tfp410
ARM: OMAP2+: Fix sparse warnings in timer.c
ARM: AM335x: Fix warning in timer.c
ARM: OMAP2+: Fix realtime_counter_init warning in timer.c
We will need this helper in the next patch to provide a file handle for
inotify marks in /proc/pid/fdinfo output.
The patch is rather providing the way to use inodes directly when dentry
is not available (like in case of inotify system).
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: James Bottomley <jbottomley@parallels.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Helsley <matt.helsley@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch brings ability to print out auxiliary data associated with
file in procfs interface /proc/pid/fdinfo/fd.
In particular further patches make eventfd, evenpoll, signalfd and
fsnotify to print additional information complete enough to restore
these objects after checkpoint.
To simplify the code we add show_fdinfo callback inside struct
file_operations (as Al and Pavel are proposing).
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: James Bottomley <jbottomley@parallels.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Helsley <matt.helsley@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add functions to get the requested number of pseudo-random bytes.
The difference from get_random_bytes() is that it generates pseudo-random
numbers by prandom_u32(). It doesn't consume the entropy pool, and the
sequence is reproducible if the same rnd_state is used. So it is suitable
for generating random bytes for testing.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eilon Greenstein <eilong@broadcom.com>
Cc: David Laight <david.laight@aculab.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Robert Love <robert.w.love@intel.com>
Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This renames all random32 functions to have 'prandom_' prefix as follows:
void prandom_seed(u32 seed); /* rename from srandom32() */
u32 prandom_u32(void); /* rename from random32() */
void prandom_seed_state(struct rnd_state *state, u64 seed);
/* rename from prandom32_seed() */
u32 prandom_u32_state(struct rnd_state *state);
/* rename from prandom32() */
The purpose of this renaming is to prevent some kernel developers from
assuming that prandom32() and random32() might imply that only
prandom32() was the one using a pseudo-random number generator by
prandom32's "p", and the result may be a very embarassing security
exposure. This concern was expressed by Theodore Ts'o.
And furthermore, I'm going to introduce new functions for getting the
requested number of pseudo-random bytes. If I continue to use both
prandom32 and random32 prefixes for these functions, the confusion
is getting worse.
As a result of this renaming, "prandom_" is the common prefix for
pseudo-random number library.
Currently, srandom32() and random32() are preserved because it is
difficult to rename too many users at once.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Robert Love <robert.w.love@intel.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Cc: David Laight <david.laight@aculab.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
linux/compiler.h has macros to denote functions that acquire or release
locks, but not to denote functions called with a lock held that return
with the lock still held. Add a __must_hold macro to cover that case.
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Reported-by: Ed Cashin <ecashin@coraid.com>
Tested-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To avoid an explosion of request_module calls on a chain of abusive
scripts, fail maximum recursion with -ELOOP instead of -ENOEXEC. As soon
as maximum recursion depth is hit, the error will fail all the way back
up the chain, aborting immediately.
This also has the side-effect of stopping the user's shell from attempting
to reexecute the top-level file as a shell script. As seen in the
dash source:
if (cmd != path_bshell && errno == ENOEXEC) {
*argv-- = cmd;
*argv = cmd = path_bshell;
goto repeat;
}
The above logic was designed for running scripts automatically that lacked
the "#!" header, not to re-try failed recursion. On a legitimate -ENOEXEC,
things continue to behave as the shell expects.
Additionally, when tracking recursion, the binfmt handlers should not be
involved. The recursion being tracked is the depth of calls through
search_binary_handler(), so that function should be exclusively responsible
for tracking the depth.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: halfdog <me@halfdog.net>
Cc: P J P <ppandit@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ptrace jailers want to be sure that the tracee can never escape
from the control. However if the tracer dies unexpectedly the
tracee continues to run in potentially unsafe mode.
Add the new ptrace option PTRACE_O_EXITKILL. If the tracer exits
it sends SIGKILL to every tracee which has this bit set.
Note that the new option is not equal to the last-option << 1. Because
currently all options have an event, and the new one starts the eventless
group. It uses the random 20 bit, so we have the room for 12 more events,
but we can also add the new eventless options below this one.
Suggested by Amnon Shiloh.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: Amnon Shiloh <u3557@miso.sublimeip.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Chris Evans <scarybeasts@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As Bruce Fields pointed out, kstrto* is currently lacking kerneldoc
comments. This patch adds kerneldoc comments to common variants of
kstrto*: kstrto(u)l, kstrto(u)ll and kstrto(u)int.
Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Joe Perches <joe@perches.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Rob Landley <rob@landley.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This function is used by sparc, powerpc tile and arm64 for compat support.
The patch adds a generic implementation with a wrapper for PowerPC to do
the u32->int sign extension.
The reason for a single patch covering powerpc, tile, sparc and arm64 is
to keep it bisectable, otherwise kernel building may fail with mismatched
function declarations.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com> [for tile]
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add lockdep annotations. Not only this can help to find the potential
problems, we do not want the false warnings if, say, the task takes two
different percpu_rw_semaphore's for reading. IOW, at least ->rw_sem
should not use a single class.
This patch exposes this internal lock to lockdep so that it represents the
whole percpu_rw_semaphore. This way we do not need to add another "fake"
->lockdep_map and lock_class_key. More importantly, this also makes the
output from lockdep much more understandable if it finds the problem.
In short, with this patch from lockdep pov percpu_down_read() and
percpu_up_read() acquire/release ->rw_sem for reading, this matches the
actual semantics. This abuses __up_read() but I hope this is fine and in
fact I'd like to have down_read_no_lockdep() as well,
percpu_down_read_recursive_readers() will need it.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Arapov <anton@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
percpu_rw_semaphore->writer_mutex was only added to simplify the initial
rewrite, the only thing it protects is clear_fast_ctr() which otherwise
could be called by multiple writers. ->rw_sem is enough to serialize the
writers.
Kill this mutex and add "atomic_t write_ctr" instead. The writers
increment/decrement this counter, the readers check it is zero instead of
mutex_is_locked().
Move atomic_add(clear_fast_ctr(), slow_read_ctr) under down_write() to
avoid the race with other writers. This is a bit sub-optimal, only the
first writer needs this and we do not need to exclude the readers at this
stage. But this is simple, we do not want another internal lock until we
add more features.
And this speeds up the write-contended case. Before this patch the racing
writers sleep in synchronize_sched_expedited() sequentially, with this
patch multiple synchronize_sched_expedited's can "overlap" with each
other. Note: we can do more optimizations, this is only the first step.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Arapov <anton@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently the writer does msleep() plus synchronize_sched() 3 times to
acquire/release the semaphore, and during this time the readers are
blocked completely. Even if the "write" section was not actually started
or if it was already finished.
With this patch down_write/up_write does synchronize_sched() twice and
down_read/up_read are still possible during this time, just they use the
slow path.
percpu_down_write() first forces the readers to use rw_semaphore and
increment the "slow" counter to take the lock for reading, then it
takes that rw_semaphore for writing and blocks the readers.
Also. With this patch the code relies on the documented behaviour of
synchronize_sched(), it doesn't try to pair synchronize_sched() with
barrier.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anton Arapov <anton@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are several places in the kernel that use functionality like
basename(3) with the exception: in case of '/foo/bar/' we expect to get an
empty string. Let's do it common helper for them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: YAMANE Toshiaki <yamanetoshi@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This function finds the struct backlight_device for a given device tree
node. A dummy function is provided so that it safely compiles out if OF
support is disabled.
[akpm@linux-foundation.org: Don't use IS_ENABLED(CONFIG_OF)]
Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Acked-by: Jingoo Han <jg1.han@samsung.com>
Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Thierry Reding <thierry.reding@avionic-design.de>
Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The LP855x family devices support the PWM input for the backlight control.
Period of the PWM is configurable in the platform side. Platform
specific functions are unnecessary anymore because generic PWM functions
are used inside the driver.
(PWM input mode)
To set the brightness, new lp855x_pwm_ctrl() is used.
If a PWM device is not allocated, devm_pwm_get() is called.
The PWM consumer name is from the chip name such as 'lp8550' and 'lp8556'.
To get the brightness value, no additional handling is required.
Just the value of 'props.brightness' is returned.
If the PWM driver is not ready while initializing the LP855x driver, it's
OK. The PWM device can be retrieved later, when the brightness value is
changed.
Documentation is updated with an example.
[akpm@linux-foundation.org: coding-style simplification, per Thierry]
Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com>
Cc: Thierry Reding <thierry.reding@avionic-design.de>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Bryan Wu <bryan.wu@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
But the kernel decided to call it "origin" instead. Fix most of the
sites.
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently the __define_initcall() macro takes three arguments, fn, id and
level. The level argument is exactly the same as the id argument but
wrapped in quotes. To overcome this need to specify three arguments to
the __define_initcall macro, where one argument is the stringification of
another, we can just use the stringification macro instead.
Signed-off-by: Matthew Leach <matthew@mattleach.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull user namespace changes from Eric Biederman:
"While small this set of changes is very significant with respect to
containers in general and user namespaces in particular. The user
space interface is now complete.
This set of changes adds support for unprivileged users to create user
namespaces and as a user namespace root to create other namespaces.
The tyranny of supporting suid root preventing unprivileged users from
using cool new kernel features is broken.
This set of changes completes the work on setns, adding support for
the pid, user, mount namespaces.
This set of changes includes a bunch of basic pid namespace
cleanups/simplifications. Of particular significance is the rework of
the pid namespace cleanup so it no longer requires sending out
tendrils into all kinds of unexpected cleanup paths for operation. At
least one case of broken error handling is fixed by this cleanup.
The files under /proc/<pid>/ns/ have been converted from regular files
to magic symlinks which prevents incorrect caching by the VFS,
ensuring the files always refer to the namespace the process is
currently using and ensuring that the ptrace_mayaccess permission
checks are always applied.
The files under /proc/<pid>/ns/ have been given stable inode numbers
so it is now possible to see if different processes share the same
namespaces.
Through the David Miller's net tree are changes to relax many of the
permission checks in the networking stack to allowing the user
namespace root to usefully use the networking stack. Similar changes
for the mount namespace and the pid namespace are coming through my
tree.
Two small changes to add user namespace support were commited here adn
in David Miller's -net tree so that I could complete the work on the
/proc/<pid>/ns/ files in this tree.
Work remains to make it safe to build user namespaces and 9p, afs,
ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the
Kconfig guard remains in place preventing that user namespaces from
being built when any of those filesystems are enabled.
Future design work remains to allow root users outside of the initial
user namespace to mount more than just /proc and /sys."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits)
proc: Usable inode numbers for the namespace file descriptors.
proc: Fix the namespace inode permission checks.
proc: Generalize proc inode allocation
userns: Allow unprivilged mounts of proc and sysfs
userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file
procfs: Print task uids and gids in the userns that opened the proc file
userns: Implement unshare of the user namespace
userns: Implent proc namespace operations
userns: Kill task_user_ns
userns: Make create_new_namespaces take a user_ns parameter
userns: Allow unprivileged use of setns.
userns: Allow unprivileged users to create new namespaces
userns: Allow setting a userns mapping to your current uid.
userns: Allow chown and setgid preservation
userns: Allow unprivileged users to create user namespaces.
userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped
userns: fix return value on mntns_install() failure
vfs: Allow unprivileged manipulation of the mount namespace.
vfs: Only support slave subtrees across different user namespaces
vfs: Add a user namespace reference from struct mnt_namespace
...
Pull block driver update from Jens Axboe:
"Now that the core bits are in, here are the driver bits for 3.8. The
branch contains:
- A huge pile of drbd bits that were dumped from the 3.7 merge
window. Following that, it was both made perfectly clear that
there is going to be no more over-the-wall pulls and how the
situation on individual pulls can be improved.
- A few cleanups from Akinobu Mita for drbd and cciss.
- Queue improvement for loop from Lukas. This grew into adding a
generic interface for waiting/checking an even with a specific
lock, allowing this to be pulled out of md and now loop and drbd is
also using it.
- A few fixes for xen back/front block driver from Roger Pau Monne.
- Partition improvements from Stephen Warren, allowing partiion UUID
to be used as an identifier."
* 'for-3.8/drivers' of git://git.kernel.dk/linux-block: (609 commits)
drbd: update Kconfig to match current dependencies
drbd: Fix drbdsetup wait-connect, wait-sync etc... commands
drbd: close race between drbd_set_role and drbd_connect
drbd: respect no-md-barriers setting also when changed online via disk-options
drbd: Remove obsolete check
drbd: fixup after wait_even_lock_irq() addition to generic code
loop: Limit the number of requests in the bio list
wait: add wait_event_lock_irq() interface
xen-blkfront: free allocated page
xen-blkback: move free persistent grants code
block: partition: msdos: provide UUIDs for partitions
init: reduce PARTUUID min length to 1 from 36
block: store partition_meta_info.uuid as a string
cciss: use check_signature()
cciss: cleanup bitops usage
drbd: use copy_highpage
drbd: if the replication link breaks during handshake, keep retrying
drbd: check return of kmalloc in receive_uuids
drbd: Broadcast sync progress no more often than once per second
drbd: don't try to clear bits once the disk has failed
...
This reverts commit 8fa72d234d.
People disagree about how this should be done, so let's revert this for
now so that nobody starts using the new tuning interface. Tejun is
thinking about a more generic interface for thread pool affinity.
Requested-by: Tejun Heo <tj@kernel.org>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull block layer core updates from Jens Axboe:
"Here are the core block IO bits for 3.8. The branch contains:
- The final version of the surprise device removal fixups from Bart.
- Don't hide EFI partitions under advanced partition types. It's
fairly wide spread these days. This is especially dangerous for
systems that have both msdos and efi partition tables, where you
want to keep them in sync.
- Cleanup of using -1 instead of the proper NUMA_NO_NODE
- Export control of bdi flusher thread CPU mask and default to using
the home node (if known) from Jeff.
- Export unplug tracepoint for MD.
- Core improvements from Shaohua. Reinstate the recursive merge, as
the original bug has been fixed. Add plugging for discard and also
fix a problem handling non pow-of-2 discard limits.
There's a trivial merge in block/blk-exec.c due to a fix that went
into 3.7-rc at a later point than -rc4 where this is based."
* 'for-3.8/core' of git://git.kernel.dk/linux-block:
block: export block_unplug tracepoint
block: add plug for blkdev_issue_discard
block: discard granularity might not be power of 2
deadline: Allow 0ms deadline latency, increase the read speed
partitions: enable EFI/GPT support by default
bsg: Remove unused function bsg_goose_queue()
block: Make blk_cleanup_queue() wait until request_fn finished
block: Avoid scheduling delayed work on a dead queue
block: Avoid that request_fn is invoked on a dead queue
block: Let blk_drain_queue() caller obtain the queue lock
block: Rename queue dead flag
bdi: add a user-tunable cpu_list for the bdi flusher threads
block: use NUMA_NO_NODE instead of -1
block: recursive merge requests
block CFQ: avoid moving request to different queue
Pull DRM updates from Dave Airlie:
"This is the one and only next pull for 3.8, we had a regression we
found last week, so I was waiting for that to resolve itself, and I
ended up with some Intel fixes on top as well.
Highlights:
- new driver: nvidia tegra 20/30/hdmi support
- radeon: add support for previously unused DMA engines, more HDMI
regs, eviction speeds ups and fixes
- i915: HSW support enable, agp removal on GEN6, seqno wrapping
- exynos: IPP subsystem support (image post proc), HDMI
- nouveau: display class reworking, nv20->40 z compression
- ttm: start of locking fixes, rcu usage for lookups,
- core: documentation updates, docbook integration, monotonic clock
usage, move from connector to object properties"
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (590 commits)
drm/exynos: add gsc ipp driver
drm/exynos: add rotator ipp driver
drm/exynos: add fimc ipp driver
drm/exynos: add iommu support for ipp
drm/exynos: add ipp subsystem
drm/exynos: support device tree for fimd
radeon: fix regression with eviction since evict caching changes
drm/radeon: add more pedantic checks in the CP DMA checker
drm/radeon: bump version for CS ioctl support for async DMA
drm/radeon: enable the async DMA rings in the CS ioctl
drm/radeon: add VM CS parser support for async DMA on cayman/TN/SI
drm/radeon/kms: add evergreen/cayman CS parser for async DMA (v2)
drm/radeon/kms: add 6xx/7xx CS parser for async DMA (v2)
drm/radeon: fix htile buffer size computation for command stream checker
drm/radeon: fix fence locking in the pageflip callback
drm/radeon: make indirect register access concurrency-safe
drm/radeon: add W|RREG32_IDX for MM_INDEX|DATA based mmio accesss
drm/exynos: support extended screen coordinate of fimd
drm/exynos: fix x, y coordinates for right bottom pixel
drm/exynos: fix fb offset calculation for plane
...
We have several new drivers, most of the time coming with their sub devices
drivers:
- Austria Microsystem's AS3711
- Nano River's viperboard
- TI's TPS80031, AM335x TS/ADC,
- Realtek's MMC/memstick card reader
- Nokia's retu
We also got some notable cleanups and improvements:
- tps6586x got converted to IRQ domains.
- tps65910 and tps65090 moved to the regmap IRQ API.
- STMPE is now Device Tree aware.
- A general twl6040 and twl-core cleanup, with moves to the regmap I/O and IRQ
APIs and a conversion to the recently added PWM framework.
- sta2x11 gained regmap support.
Then the rest is mostly tiny cleanups and fixes, among which we have Mark's
wm5xxx and wm8xxx patchset.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJQznPnAAoJEIqAPN1PVmxKuA8P/0nOJduXFM1c0Gy+DD5DnJnG
cXzzeSTV8iO3a3sHIye43QPJ5V2YUR5uxLTUEOo/G7my/MoZ/azeNidkUD3qLVlm
wVIq35lcS8dWTZaY7nlpBcWc6e39UB0sEueuJNxyhOv5lnMKdi2tAow5f4vIRQnd
Q67/EbrgqdltcOpGmVuCdQcvphvWgy+K65jzbJG5zXs7hGX13Q+M5RnYhx76kc8f
TDd0APZ71n5/RyISFSBSu2vfl2kES6o47aMgqqXMEHri6d3puAaXM0rFoMzXg/4G
eBdxndN25H7rW7xvt9tuUod2rn1AO7tif5d7jal3Cfj61y3iqKY30yb3OzS9XQXH
9WZ2qDst11zvzQivxIkMGvfRXRfncNLWR4DrBSqVfSbYV2uQj2eS8C6ONwKVMXsQ
5tjNp91PFqN19sWQjIjSMcrNswxgpvdQ9mqFTyOGmISbqrpPSTi+MuO8r9+xTfUF
PnzUX2nVOW/i9NcI7uotjzh8jiw6t8XMVHhkehiSYR9hzCb6MaPsFPN4jWq9XA2m
1htCHylNpHqHQ3Mup7Is6j0Li1ahdwfm4lbrgiVEA4t4Mqs5E/Ka+3V8laNAKylW
PfCP/VmnJYzmgVTK/qobFNeKzRqR0i4WTL6T7oAxGL87Q4TJaqKpEkXWne8UXV+Q
yIbN0fmWfCveCetM+vaf
=F790
-----END PGP SIGNATURE-----
Merge tag 'mfd-3.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6
Pull MFS update from Samuel Ortiz:
"This is the MFD patch set for the 3.8 merge window.
We have several new drivers, most of the time coming with their sub
devices drivers:
- Austria Microsystem's AS3711
- Nano River's viperboard
- TI's TPS80031, AM335x TS/ADC,
- Realtek's MMC/memstick card reader
- Nokia's retu
We also got some notable cleanups and improvements:
- tps6586x got converted to IRQ domains.
- tps65910 and tps65090 moved to the regmap IRQ API.
- STMPE is now Device Tree aware.
- A general twl6040 and twl-core cleanup, with moves to the regmap
I/O and IRQ APIs and a conversion to the recently added PWM
framework.
- sta2x11 gained regmap support.
Then the rest is mostly tiny cleanups and fixes, among which we have
Mark's wm5xxx and wm8xxx patchset."
Far amount of annoying but largely trivial conflicts. Many due to
__devinit/exit removal, others due to one or two of the new drivers also
having come in through another tree.
* tag 'mfd-3.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: (119 commits)
mfd: tps6507x: Convert to devm_kzalloc
mfd: stmpe: Update DT support for stmpe driver
mfd: wm5102: Add readback of DSP status 3 register
mfd: arizona: Log if we fail to create the primary IRQ domain
mfd: tps80031: MFD_TPS80031 needs to select REGMAP_IRQ
mfd: tps80031: Add terminating entry for tps80031_id_table
mfd: sta2x11: Fix potential NULL pointer dereference in __sta2x11_mfd_mask()
mfd: wm5102: Add tuning for revision B
mfd: arizona: Defer patch initialistation until after first device boot
mfd: tps65910: Fix wrong ack_base register
mfd: tps65910: Remove unused data
mfd: stmpe: Get rid of irq_invert_polarity
mfd: ab8500-core: Fix invalid free of devm_ allocated data
mfd: wm5102: Mark DSP memory regions as volatile
mfd: wm5102: Correct default for LDO1_CONTROL_2
mfd: arizona: Register haptics devices
mfd: wm8994: Make current device behaviour the default
mfd: tps65090: MFD_TPS65090 needs to select REGMAP_IRQ
mfd: Fix stmpe.c build when OF is not enabled
mfd: jz4740-adc: Use devm_kzalloc
...
- Use dma addresses instead of the virt_to_phys and vice versa functions.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJQy9RFAAoJEFjIrFwIi8fJaYwH/0aC5xbJYSNubBI5opZe7APC
b1LzHpV1yXyIZc2wOqdhzFmmrzRLVHULnfG/oOF9/VaAjrxWeZVDs54DF33Xao9M
pOFLrY9kwK+AB5DPm0aBzF7JOS8Fd+alNM7Dork3mtsxbqqX3fx0OHvNTZJVdrj8
ToYVoTiz7dpXC1IIgBoJf6XKrwqvcQrZc7yXgz0A8ZZhCUZSBMHV1Lifx/8xzpQ+
bgfTSsH6CwejlpSXWQ6jfsF2C/Ku/nAWOKrpVZleb08svgPR/UA8Nhkjlaoj2xbp
zq/aWnpaF4fKiYmY6qrRGo+ABcUHua7Iaco8ZjNCyrIPfInSqhxua/Ajzk1xtNU=
=QiGx
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
Pull swiotlb update from Konrad Rzeszutek Wilk:
"Feature:
- Use dma addresses instead of the virt_to_phys and vice versa
functions.
Remove the multitude of phys_to_virt/virt_to_phys calls and instead
operate on the physical addresses instead of virtual in many of the
internal functions. This does provide a speed up in interrupt
handlers that do DMA operations and use SWIOTLB."
* tag 'stable/for-linus-3.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
swiotlb: Do not export swiotlb_bounce since there are no external consumers
swiotlb: Use physical addresses instead of virtual in swiotlb_tbl_sync_single
swiotlb: Use physical addresses for swiotlb_tbl_unmap_single
swiotlb: Return physical addresses when calling swiotlb_tbl_map_single
swiotlb: Make io_tlb_overflow_buffer a physical address
swiotlb: Make io_tlb_start a physical address instead of a virtual one
swiotlb: Make io_tlb_end a physical address instead of a virtual one
inline data, which allows small files or directories to be stored in
the in-inode extended attribute area. (This requires that the file
system use inodes which are at least 256 bytes or larger; 128 byte
inodes do not have any room for in-inode xattrs.)
The second new feature is SEEK_HOLE/SEEK_DATA support. This is
enabled by the extent status tree patches, and this infrastructure
will be used to further optimize ext4 in the future.
Beyond that, we have the usual collection of code cleanups and bug
fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABCAAGBQJQzTaLAAoJENNvdpvBGATwpqEQAM0WO9Kva3R8SoaD6NYOg4lN
8oxRlht6yogSd6wwYZm1c4YF9UrhloS9kHyWcH3Wmr9fhM5vig1ec12eDsDGrjBc
Wb+x+YrmczSJzK380JLxmYnVSXQVFl7/hNqaRowffTOJwgySmp8oLrI88ZcaCmVU
+qWG2x6eVhCEQrpin9Mv3D6pHkx2hfg9w5sB0K+kpgsdjqLZsmPRmxU9nx0nEJYC
gmbpo8Dcsfqra6DJosQGo7eFq7J3fm9v1ql+QOxOjc9/zD2XwdQE1JZImehvno5i
Ekwr9771fsw34/QHJebYRC/OkftmOn4OPuQejd+AKNdBR4mO8G/AsLCroD17uLNi
NrtMkE6ecJPb3SflarZruNYTUhJfj3H6V9P/8wggpyPzT3l19sqP+2F6GwZspZiV
EJb2iTKn0Phc2OD1MqO9gFP0g+IMH0kktYdxEf0V2QOQqhQHnPwxF+2Tp6bVQcQs
KCetN37y60qJ+zKH9xukcXmWQJvnjgmWqZqpomoA4lrwgKazTNDJJ+R+N+r5HKMj
5cz2ntAhF8FfPhqVf+8DHgjKNUwm6C++O1+Lb9swZ0FkFi5Ob3OlwWaC75Gf4H+P
2DslBapfM79bX14a9BKaBjly5FsAha7OzR+xo0MZN+fEcMLEk33kcRovcY8DHqxU
aadriOatYYixvSZ5lL3m
=aNOf
-----END PGP SIGNATURE-----
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 update from Ted Ts'o:
"There are two major features for this merge window. The first is
inline data, which allows small files or directories to be stored in
the in-inode extended attribute area. (This requires that the file
system use inodes which are at least 256 bytes or larger; 128 byte
inodes do not have any room for in-inode xattrs.)
The second new feature is SEEK_HOLE/SEEK_DATA support. This is
enabled by the extent status tree patches, and this infrastructure
will be used to further optimize ext4 in the future.
Beyond that, we have the usual collection of code cleanups and bug
fixes."
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (63 commits)
ext4: zero out inline data using memset() instead of empty_zero_page
ext4: ensure Inode flags consistency are checked at build time
ext4: Remove CONFIG_EXT4_FS_XATTR
ext4: remove unused variable from ext4_ext_in_cache()
ext4: remove redundant initialization in ext4_fill_super()
ext4: remove redundant code in ext4_alloc_inode()
ext4: use sync_inode_metadata() when syncing inode metadata
ext4: enable ext4 inline support
ext4: let fallocate handle inline data correctly
ext4: let ext4_truncate handle inline data correctly
ext4: evict inline data out if we need to strore xattr in inode
ext4: let fiemap work with inline data
ext4: let ext4_rename handle inline dir
ext4: let empty_dir handle inline dir
ext4: let ext4_delete_entry() handle inline data
ext4: make ext4_delete_entry generic
ext4: let ext4_find_entry handle inline data
ext4: create a new function search_dir
ext4: let ext4_readdir handle inline data
ext4: let add_dir_entry handle inline data properly
...
Pull security subsystem updates from James Morris:
"A quiet cycle for the security subsystem with just a few maintenance
updates."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
Smack: create a sysfs mount point for smackfs
Smack: use select not depends in Kconfig
Yama: remove locking from delete path
Yama: add RCU to drop read locking
drivers/char/tpm: remove tasklet and cleanup
KEYS: Use keyring_alloc() to create special keyrings
KEYS: Reduce initial permissions on keys
KEYS: Make the session and process keyrings per-thread
seccomp: Make syscall skipping and nr changes more consistent
key: Fix resource leak
keys: Fix unreachable code
KEYS: Add payload preparsing opportunity prior to key instantiate or update
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQIcBAABAgAGBQJQx0kQAAoJEHzG/DNEskfi4fQP/R5PRovayroZALBMLnVJDaLD
Ttr9p40VNXbiJ+MfRgatJjSSJZ4Jl+fC3NEqBhcwVZhckZZb9R2s0WtrSQo5+ZbB
vdRfiuKoCaKM4cSZ08C12uTvsF6xjhjd27CTUlMkyOcDoKxMEFKelv0hocSxe4Wo
xqlv3eF+VsY7kE1BNbgBP06SX4tDpIHRxXfqJPMHaSKQmre+cU0xG2GcEu3QGbHT
DEDTI788YSaWLmBfMC+kWoaQl1+bV/FYvavIAS8/o4K9IKvgR42VzrXmaFaqrbgb
72ksa6xfAi57yTmZHqyGmts06qYeBbPpKI+yIhCMInxA9CY3lPbvHppRf0RQOyzj
YOi4hovGEMJKE+BCILukhJcZ9jCTtS3zut6v1rdvR88f4y7uhR9RfmRfsxuW7PNj
3Rmh191+n0lVWDmhOs2psXuCLJr3LEiA0dFffN1z8REUTtTAZMsj8Rz+SvBNAZDR
hsJhERVeXB6X5uQ5rkLDzbn1Zic60LjVw7LIp6SF2OYf/YKaF8vhyWOA8dyCEu8W
CGo7AoG0BO8tIIr8+LvFe8CweypysZImx4AjCfIs4u9pu/v11zmBvO9NO5yfuObF
BreEERYgTes/UITxn1qdIW4/q+Nr0iKO3CTqsmu6L1GfCz3/XzPGs3U26fUhllqi
Ka0JKgnWvsa6ez6FSzKI
=ivQa
-----END PGP SIGNATURE-----
Merge tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma
Pull Automatic NUMA Balancing bare-bones from Mel Gorman:
"There are three implementations for NUMA balancing, this tree
(balancenuma), numacore which has been developed in tip/master and
autonuma which is in aa.git.
In almost all respects balancenuma is the dumbest of the three because
its main impact is on the VM side with no attempt to be smart about
scheduling. In the interest of getting the ball rolling, it would be
desirable to see this much merged for 3.8 with the view to building
scheduler smarts on top and adapting the VM where required for 3.9.
The most recent set of comparisons available from different people are
mel: https://lkml.org/lkml/2012/12/9/108
mingo: https://lkml.org/lkml/2012/12/7/331
tglx: https://lkml.org/lkml/2012/12/10/437
srikar: https://lkml.org/lkml/2012/12/10/397
The results are a mixed bag. In my own tests, balancenuma does
reasonably well. It's dumb as rocks and does not regress against
mainline. On the other hand, Ingo's tests shows that balancenuma is
incapable of converging for this workloads driven by perf which is bad
but is potentially explained by the lack of scheduler smarts. Thomas'
results show balancenuma improves on mainline but falls far short of
numacore or autonuma. Srikar's results indicate we all suffer on a
large machine with imbalanced node sizes.
My own testing showed that recent numacore results have improved
dramatically, particularly in the last week but not universally.
We've butted heads heavily on system CPU usage and high levels of
migration even when it shows that overall performance is better.
There are also cases where it regresses. Of interest is that for
specjbb in some configurations it will regress for lower numbers of
warehouses and show gains for higher numbers which is not reported by
the tool by default and sometimes missed in treports. Recently I
reported for numacore that the JVM was crashing with
NullPointerExceptions but currently it's unclear what the source of
this problem is. Initially I thought it was in how numacore batch
handles PTEs but I'm no longer think this is the case. It's possible
numacore is just able to trigger it due to higher rates of migration.
These reports were quite late in the cycle so I/we would like to start
with this tree as it contains much of the code we can agree on and has
not changed significantly over the last 2-3 weeks."
* tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits)
mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
mm/rmap: Convert the struct anon_vma::mutex to an rwsem
mm: migrate: Account a transhuge page properly when rate limiting
mm: numa: Account for failed allocations and isolations as migration failures
mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
mm: numa: Add THP migration for the NUMA working set scanning fault case.
mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG
mm: sched: numa: Control enabling and disabling of NUMA balancing
mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely task<->node relationships
mm: numa: migrate: Set last_nid on newly allocated page
mm: numa: split_huge_page: Transfer last_nid on tail page
mm: numa: Introduce last_nid to the page frame
sched: numa: Slowly increase the scanning period as NUMA faults are handled
mm: numa: Rate limit setting of pte_numa if node is saturated
mm: numa: Rate limit the amount of memory that is migrated between nodes
mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
mm: numa: Migrate pages handled during a pmd_numa hinting fault
mm: numa: Migrate on reference policy
...
This reverts commit bd52276fa1 ("x86-64/efi: Use EFI to deal with
platform wall clock (again)"), and the two supporting commits:
da5a108d05b4: "x86/kernel: remove tboot 1:1 page table creation code"
185034e72d59: "x86, efi: 1:1 pagetable mapping for virtual EFI calls")
as they all depend semantically on commit 53b87cf088 ("x86, mm:
Include the entire kernel memory map in trampoline_pgd") that got
reverted earlier due to the problems it caused.
This was pointed out by Yinghai Lu, and verified by me on my Macbook Air
that uses EFI.
Pointed-out-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here are some drivers/extcon/ patches that I forgot to have you pull in
the larger char/misc patchset from yesterday, for the 3.8-rc1 kernel.
Nothing major here, just some driver updates, and cleanups, all of which
have been in linux-next for a while now.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iEYEABECAAYFAlDJGnYACgkQMUfUDdst+ykxBgCaAgowyu5iGOIlDf5+F8KL8fsm
T3gAoKt+g3zaN3A3zfPLMr6zeKlUNF+T
=xiFu
-----END PGP SIGNATURE-----
Merge tag 'char-misc-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull EXTCON patches from Greg Kroah-Hartman:
"Here are some drivers/extcon/ patches that I forgot to have you pull
in the larger char/misc patchset from yesterday, for the 3.8-rc1
kernel.
Nothing major here, just some driver updates, and cleanups, all of
which have been in linux-next for a while now.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'char-misc-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
extcon: kernel_doc style fix
extcon: max77693: Fix uninitialised variable warning
extcon: max77693: Use devm_kzalloc
extcon: max8997: Use devm_kzalloc
extcon: max8997: Fix a typo
extcon: max8997: Fix checkpatch error
extcon: max77693: Fix coding style
extcon: max77693: Fix incorrect error check and return value
extcon: max8997: Fix incorrect error check and return value
extcon: Fix return value in extcon-class.c
extcon: Add missing header file to extcon.h
extcon: arizona: unlock mutex on error path in arizona_micdet()
OMAPDSS changes, including:
- use dynanic debug prints
- OMAP platform dependency removals
- Creation of compat-layer, helping us to improve omapdrm
- Misc cleanups, aiming to make omadss more in line with the upcoming common
display framework
Exynos DP changes for the 3.8 merge window:
- Device Tree support for Samsung Exynos DP
- SW Link training is cleaned up.
- HPD interrupt is supported.
Samsung Framebuffer changes for the 3.8 merge window:
- The bit definitions of header file are updated.
- Some minor typos are fixed.
- Some minor bugs of s3c_fb_check_var() are fixed.
FB related changes for SH Mobile, Freescale DIU
Add support for the Solomon SSD1307 OLED Controller
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQyvniAAoJEPo9qoy8lh71IqkQAK9LkVjVk6UQRT2/b39hPF6i
PKPRNYxpLxX5u98hiqQfweukGkuC/cmsAJOh5UwX1lFg0TNua8CEYg0iuHgUtLWE
gi+HNUhHfGx1+a9yQpTKsiheq7r88IjpghFGahnFhpE4kDzhIROom7B39x8bx9gs
1vnCptmcIq0IlMlmjIktTHw8R36Fd/8xDVEhEtDccxT9AJSePnKBY8SBuRjiyJfP
QvXnd131kmHdshQbSVetsEwI1a9NJ9Z6MMRiQK3vtDiBJ7QHUw8wD/+kdR68gQxZ
NqTC/s478dvx/bR5CIXHrXxQZGfOCJ0GFzT1rlcRMxECcF0dQYNr3Hid3Zas+mPZ
hABpEJE7nV7LoDGhW3oIWPt6MZVFED/HE0F5QrG3/PWYs5UpTwkI9UTkB8Bx6LcA
H9ypHzFaOT1SK2/jln9pAfYgo/ITCRf08gcxBvGZUPcybrQy+kbVPdYgDn09cDhR
u6ax3k/PMF/6JRdpVRNxLaoEHw/G6vtNjoeRJivZzuIOmGDfGwADGcEnrcZEM3NM
muOjk8TfTJgK+gzWleOfr9CQGiWWp1Tht9vF6TSApq4SEqmOK81EZpucrMXY0t6r
9+JmFzS1/+75NeY2YjgHCuqRaqVSMiyyOp1eYV9JIY14rKINbj77OHX8xaMm+Leg
ZU4tirie/oEALaIJUann
=KVPX
-----END PGP SIGNATURE-----
Merge tag 'fbdev-for-3.8' of git://gitorious.org/linux-omap-dss2/linux
Pull fbdev changes from Tomi Valkeinen:
"OMAPDSS changes, including:
- use dynanic debug prints
- OMAP platform dependency removals
- Creation of compat-layer, helping us to improve omapdrm
- Misc cleanups, aiming to make omadss more in line with the upcoming
common display framework
Exynos DP changes for the 3.8 merge window:
- Device Tree support for Samsung Exynos DP
- SW Link training is cleaned up.
- HPD interrupt is supported.
Samsung Framebuffer changes for the 3.8 merge window:
- The bit definitions of header file are updated.
- Some minor typos are fixed.
- Some minor bugs of s3c_fb_check_var() are fixed.
FB related changes for SH Mobile, Freescale DIU
Add support for the Solomon SSD1307 OLED Controller"
* tag 'fbdev-for-3.8' of git://gitorious.org/linux-omap-dss2/linux: (191 commits)
OMAPDSS: fix TV-out issue with DSI PLL
Revert "OMAPFB: simplify locking"
OMAPFB: remove silly loop in fb2display()
OMAPFB: fix error handling in omapfb_find_best_mode()
OMAPFB: use devm_kzalloc to allocate omapfb2_device
OMAPDSS: DISPC: remove dispc fck uses
OMAPDSS: DISPC: get dss clock rate from dss driver
drivers/video/console/softcursor.c: remove redundant NULL check before kfree()
drivers/video: add support for the Solomon SSD1307 OLED Controller
OMAPDSS: use omapdss_compat_init() in other drivers
OMAPDSS: export dispc functions
OMAPDSS: export dss_feat functions
OMAPDSS: export dss_mgr_ops functions
OMAPDSS: separate compat files in the Makefile
OMAPDSS: move display sysfs init to compat layer
OMAPDSS: DPI: use dispc's check_timings
OMAPDSS: DISPC: add dispc_ovl_check()
OMAPDSS: move irq handling to dispc-compat
OMAPDSS: move omap_dispc_wait_for_irq_interruptible_timeout to dispc-compat.c
OMAPDSS: move blocking mgr enable/disable to compat layer
...
Conflicts:
arch/arm/mach-davinci/devices-da8xx.c
arch/arm/plat-omap/common.c
drivers/media/platform/omap/omap_vout.c
Shave a few bytes off the slot table size by moving the RPC timestamp
into the sequence results.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
constification of node argument to of_parse_phandle_with_args().
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQy6rNAAoJEEFnBt12D9kBbaAQALQM65g8FIqTo4qzBqKHza9f
1PM+EfqsRVzngw7vUSXd0ui54MtUnpxSrqxywHj7cPNkHwq+6UBFDXeD5oJI1i0H
IYhq/Ne40aFPAvA/W3+jqtwjH7j6An77c9P5L4CMddSgnIVaHf4n6YEpXgSIJiVz
FXvlHljOUK+KAggc/HHgAOymeLALcLKqSSf11mSDHi1iBeEjrNyZpKSAzHAg44pf
tE84yLCQXDbt2id8YYxXE6rIb6IwHBNSDY6YcxZWWHZqq5nNDyYTV4NqURViAu9+
KJy1pY578cvYNJYcuaOwZZpuN7bPLlfjNO0mgQnl7Z0wzDFr5LBDCJ8/BGmmUKmS
ryWYDXMWFCvmDfa6H/uzcrgI7V1P3cG2t7B11Bvil6axTUUpFRqi6SyrB9jr+hrS
gA+LnMo0IpSy+jePStAalFUyKK/8fOfgAG0bZrK4taU9qnLDlijfTGg3CBqEoMWD
dssutlHW7YIGM1UC00vUARQ0rWUFDF5HAqVaE6OfCOvNqfDwrZMuF1gtg0ycWEaY
UNbT1KvOmtu4Ypi6Jkv9NPLW9GfLmTuAPRYhusIUTPQJTS6o4ZflJ9jeohS69kLX
63qtMD/bpy8z5fYgS5D+EyNFxWmFKQfm3SDlHIIaouMMqd2IEr6CTRMisvZEmEnx
qnt9fKKQkoPtwWPLG5ND
=Zs/W
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6
Pull another devicetree update from Grant Likely:
"Here's a couple more devicetree changes that I missed in the first
pull by putting the tag in the wrong place.
Two minor devicetree fixups for v3.8. Addition of dummy inlines and
constification of node argument to of_parse_phandle_with_args()."
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6:
of: *node argument to of_parse_phandle_with_args should be const
of/i2c: add dummy inline functions for when CONFIG_OF_I2C(_MODULE) isn't defined
This is a branch with updates for Marvell's mvebu/kirkwood platforms. They
came in late-ish, and were heavily interdependent such that it didn't
make sense to split them up across the cross-platform topic branches. So
here they are (for the second release in a row) in a branch on their own.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQy5i9AAoJEIwa5zzehBx3ZskP/2wxjbwEaNdnR+7j8595bTaa
GYq8qJ4lUCOKmSqp3bQkg/Plm2D88p78BO5qTm2io527gl10HemzCiGaejclujIw
sDFZPAE8K0Z8p0gQcBNlRZNuI3J1N6IKRqYH5SIJ2vWmBMfO7nKRR9nmTiDpm5bx
IcuKX2u/mhyXWN+F0EcHqcupH1K+mdzyGdIQk80Tyqni+UTN+pd0efLM6WL4SFJM
5fj64dDFpVDA8t+O2Avz8p+lx07vkSy2wIXWt7Ik9BVtsyZQecn+9lpl8FvcrSK/
MgL3QO4kqDpJDs88M7DJURU1/EdsWZc32M63avctaWnGWItQAbOJYBDmZTlng08x
ZGrKOgf/I6le7wEpnzdag9ymI/rAL8I0755FkfXxf1R7/X40b+t8/61J/ddOKTDs
1sTVt+eKyyIMWle4V4zENa03goVBApCIEXcmnuFisFNbBY6azV31inJEp/3PvpgE
GeMBfxBDkvn+03LkRFcZlhTeDsNTdctD+sfgrNPaQf5bZGIvEz87vgfNTIiaU3GA
Vd5aiainVDQgmpoFfRG6391gdFlF2l9d67LoG4ClCjn4WL+UxcTRuzBW/liORpUO
E7CwMHtPq6eoGKywiKMFRzY2QRIKZRkxrC2PCJ/1V9mbIGwgaD/3BQ2/czwrnc8q
1gnxWx8E5SKEGcDJXD+6
=7luC
-----END PGP SIGNATURE-----
Merge tag 'mvebu' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC updates for Marvell mvebu/kirkwood from Olof Johansson:
"This is a branch with updates for Marvell's mvebu/kirkwood platforms.
They came in late-ish, and were heavily interdependent such that it
didn't make sense to split them up across the cross-platform topic
branches. So here they are (for the second release in a row) in a
branch on their own."
* tag 'mvebu' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (88 commits)
arm: l2x0: add aurora related properties to OF binding
arm: mvebu: add Aurora L2 Cache Controller to the DT
arm: mvebu: add L2 cache support
dma: mv_xor: fix error handling path
dma: mv_xor: fix error checking of irq_of_parse_and_map()
dma: mv_xor: use request_irq() instead of devm_request_irq()
dma: mv_xor: clear the window override control registers
arm: mvebu: fix address decoding armada_cfg_base() function
ARM: mvebu: update defconfig with I2C and RTC support
ARM: mvebu: Add SATA support for OpenBlocks AX3-4
ARM: mvebu: Add support for the RTC in OpenBlocks AX3-4
ARM: mvebu: Add support for I2C on OpenBlocks AX3-4
ARM: mvebu: Add support for I2C controllers in Armada 370/XP
arm: mvebu: Add hardware I/O Coherency support
arm: plat-orion: Add coherency attribute when setup mbus target
arm: dma mapping: Export a dma ops function arm_dma_set_mask
arm: mvebu: Add SMP support for Armada XP
arm: mm: Add support for PJ4B cpu and init routines
arm: mvebu: Add IPI support via doorbells
arm: mvebu: Add initial support for power managmement service unit
...
We need to move this file to allow ARM multiplatform configurations
to build for omap2+. This can now be done as this file now only
contains platform_data.
cc: Russell King <linux@arm.linux.org.uk>
cc: Alan Cox <alan@linux.intel.com>
cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cc: Govindraj.R <govindraj.raja@ti.com>
cc: Kevin Hilman <khilman@ti.com>
cc: linux-serial@vger.kernel.org
Reviewed-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
This branch contains device-tree updates for the SPEAr platform.
They had dependencies on earlier branches from this merge window,
which is why they were broken out in a separate branch.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQy5h1AAoJEIwa5zzehBx3TukP/ijMApMtSf5NuGfjeHz0eh18
JdWVa+89Bgrf1ggVWrnrdfMjHVjsQlu/pwGQF9BY3QmmVlyQIjcxEUMmZ2tw69rw
wMu17S40Hv746XHB7zkg8HDMzYL7btwxUeo8hX4NUQZSaL82VVB/RndZyIkWQ11q
FqUG1CS4tunw75sedbJI0NukV9rHF29XYTfAKOqpuBbHC+fNAZDSZIoHNQSJmdQi
lUc8uw0WD9cZBJN+yt4qKUv6B07Tkc8yPCKiDj+onwEddWahHsLEO+q2e15Llb6x
qThG2tgtC4WqqrGyxjZHgIwy/uis6v0jerN9CbUTOUeJP3XosC9xg4novWvH6vrK
K3iQnC5Pl7FGIxIjJoLbttbkVTz9SqLuFa31cAdjTPODC2D7trPKqFoJMQA2DjU3
XqTJc/Q6Xbgee1qGWjdulDTFBK6zsqbdSC/uB/Ro6FRsOToypdB4AVQo4gl9I5g3
dUC6ZghgOlbff0pFazSsfMDQOmzyBq4694Xnosq7C/KkLL1HyRe8qqEg/gBfPcQZ
NKXzoIrGIR1pTs9VN00ScBMWtbJWd9OkVMc3aGUQZA+gIUE2ExNfzls0n+1pAhAb
god1Aq0s1ZWqGSeguENRGMN/lUYYN8TCeOlTIY3eKPxw9ybwSxOdCyH2iLlXOYHj
l1J6msYWvhULWpLQST7r
=iK4W
-----END PGP SIGNATURE-----
Merge tag 'dt2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC device-tree updates, take 2, from Olof Johansson:
"This branch contains device-tree updates for the SPEAr platform. They
had dependencies on earlier branches from this merge window, which is
why they were broken out in a separate branch."
* tag 'dt2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: SPEAr3xx: Shirq: Move shirq controller out of plat/
ARM: SPEAr320: DT: Add SPEAr 320 HMI board support
ARM: SPEAr3xx: DT: add shirq node for interrupt multiplexor
ARM: SPEAr3xx: shirq: simplify and move the shared irq multiplexor to DT
ARM: SPEAr1310: Fix AUXDATA for compact flash controller
ARM: SPEAr13xx: Remove fields not required for ssp controller
ARM: SPEAr1310: Move 1310 specific misc register into machine specific files
ARM: SPEAr: DT: Update device nodes
ARM: SPEAr: DT: add uart state to fix warning
ARM: SPEAr: DT: Modify DT bindings for STMMAC
ARM: SPEAr: DT: Fix existing DT support
ARM: SPEAr: DT: Update partition info for MTD devices
ARM: SPEAr: DT: Update pinctrl list
ARM: SPEAr13xx: DT: Add spics gpio controller nodes
Pull MIPS updates from Ralf Baechle:
"The MIPS bits for 3.8. This also includes a bunch fixes that were
sitting in the linux-mips.org git tree for a long time. This pull
request contains updates to several OCTEON drivers and the board
support code for BCM47XX, BCM63XX, XLP, XLR, XLS, lantiq, Loongson1B,
updates to the SSB bus support, MIPS kexec code and adds support for
kdump.
When pulling this, there are two expected merge conflicts in
include/linux/bcma/bcma_driver_chipcommon.h which are trivial to
resolve, just remove the conflict markers and keep both alternatives."
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (90 commits)
MIPS: PMC-Sierra Yosemite: Remove support.
VIDEO: Newport Fix console crashes
MIPS: wrppmc: Fix build of PCI code.
MIPS: IP22/IP28: Fix build of EISA code.
MIPS: RB532: Fix build of prom code.
MIPS: PowerTV: Fix build.
MIPS: IP27: Correct fucked grammar in ops-bridge.c
MIPS: Highmem: Fix build error if CONFIG_DEBUG_HIGHMEM is disabled
MIPS: Fix potencial corruption
MIPS: Fix for warning from FPU emulation code
MIPS: Handle COP3 Unusable exception as COP1X for FP emulation
MIPS: Fix poweroff failure when HOTPLUG_CPU configured.
MIPS: MT: Fix build with CONFIG_UIDGID_STRICT_TYPE_CHECKS=y
MIPS: Remove unused smvp.h
MIPS/EDAC: Improve OCTEON EDAC support.
MIPS: OCTEON: Add definitions for OCTEON memory contoller registers.
MIPS: OCTEON: Add OCTEON family definitions to octeon-model.h
ata: pata_octeon_cf: Use correct byte order for DMA in when built little-endian.
MIPS/OCTEON/ata: Convert pata_octeon_cf.c to use device tree.
MIPS: Remove usage of CEVT_R4K_LIB config option.
...
Instead of using cpu_is_omap..() macros in the device driver we
rely on information provided in the platform data.
The only information we need is whether the USB Host module has
a single ULPI bypass control bit for all ports or individual bypass
control bits for each port. OMAP3 REV2.1 and earlier have the former.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
[tony@atomide.com: updated to remove plat/cpu.h]
Signed-off-by: Tony Lindgren <tony@atomide.com>
In MD raid case, discard granularity might not be power of 2, for example, a
4-disk raid5 has 3*chunk_size discard granularity. Correct the calculation for
such cases.
Reported-by: Neil Brown <neilb@suse.de>
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull x86 EFI update from Peter Anvin:
"EFI tree, from Matt Fleming. Most of the patches are the new efivarfs
filesystem by Matt Garrett & co. The balance are support for EFI
wallclock in the absence of a hardware-specific driver, and various
fixes and cleanups."
* 'core-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
efivarfs: Make efivarfs_fill_super() static
x86, efi: Check table header length in efi_bgrt_init()
efivarfs: Use query_variable_info() to limit kmalloc()
efivarfs: Fix return value of efivarfs_file_write()
efivarfs: Return a consistent error when efivarfs_get_inode() fails
efivarfs: Make 'datasize' unsigned long
efivarfs: Add unique magic number
efivarfs: Replace magic number with sizeof(attributes)
efivarfs: Return an error if we fail to read a variable
efi: Clarify GUID length calculations
efivarfs: Implement exclusive access for {get,set}_variable
efivarfs: efivarfs_fill_super() ensure we clean up correctly on error
efivarfs: efivarfs_fill_super() ensure we free our temporary name
efivarfs: efivarfs_fill_super() fix inode reference counts
efivarfs: efivarfs_create() ensure we drop our reference on inode on error
efivarfs: efivarfs_file_read ensure we free data in error paths
x86-64/efi: Use EFI to deal with platform wall clock (again)
x86/kernel: remove tboot 1:1 page table creation code
x86, efi: 1:1 pagetable mapping for virtual EFI calls
x86, mm: Include the entire kernel memory map in trampoline_pgd
...
Pull x86 ACPI update from Peter Anvin:
"This is a patchset which didn't make the last merge window. It adds a
debugging capability to feed ACPI tables via the initramfs.
On a grander scope, it formalizes using the initramfs protocol for
feeding arbitrary blobs which need to be accessed early to the kernel:
they are fed first in the initramfs blob (lots of bootloaders can
concatenate this at boot time, others can use a single file) in an
uncompressed cpio archive using filenames starting with "kernel/".
The ACPI maintainers requested that this patchset be fed via the x86
tree rather than the ACPI tree as the footprint in the general x86
code is much bigger than in the ACPI code proper."
* 'x86-acpi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
X86 ACPI: Use #ifdef not #if for CONFIG_X86 check
ACPI: Fix build when disabled
ACPI: Document ACPI table overriding via initrd
ACPI: Create acpi_table_taint() function to avoid code duplication
ACPI: Implement physical address table override
ACPI: Store valid ACPI tables passed via early initrd in reserved memblock areas
x86, acpi: Introduce x86 arch specific arch_reserve_mem_area() for e820 handling
lib: Add early cpio decoder
Pull x86 RAS update from Ingo Molnar:
"Rework all config variables used throughout the MCA code and collect
them together into a mca_config struct. This keeps them tightly and
neatly packed together instead of spilled all over the place.
Then, convert those which are used as booleans into real booleans and
save some space. These bits are exposed via
/sys/devices/system/machinecheck/machinecheck*/"
* 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, MCA: Finish mca_config conversion
x86, MCA: Convert the next three variables batch
x86, MCA: Convert rip_msr, mce_bootlog, monarch_timeout
x86, MCA: Convert dont_log_ce, banks and tolerant
drivers/base: Add a DEVICE_BOOL_ATTR macro
This reverts commit de90cd71f6.
Shane Huang writes:
Please suspend this patch because I just received two new
DevSlp drives but found word 78 bit 5 is _not_ set.
I'm checking with the drive vendor whether he gave me
the wrong information. If bit 5 is not the necessary and
sufficient condition, I will implement another patch to
replace ata_device->sata_settings into ->devslp_timing.
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
- Two new drivers from Pali Rohár and N900 hackers: rx51_battery and
bq2415x_charger. The drivers are a part of a solution to replace the
proprietary Nokia BME stack;
- Power supply core now registers devices with a thermal cooling
subsystem, so we can now automatically throttle charging. Thanks to
Ramakrishna Pallala!
- Device tree support for ab8500 and max8925_power drivers;
- Random fixups and enhancements for a bunch of drivers.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJQyFGdAAoJEGgI9fZJve1bYy8P/iu4XRL2kNwmPD7fMO+bZSao
xCSi8zRCnOtDutB48MQXIDTi1+I64+Qqq5xTuHW5n/T0zJ88biGBZfmJLjM88oQ4
ZqfO/axT6/MNFXTQtxEvXtn/AoM6lp9ZLmIbs7mkSpcnuG8iY5eovKfeyyNiAoGh
W6ISc9d1zZl8JYZXkkX4y5uxsH/t34rPjWJqygMulJrVdePpC9fNWwDYnul6T7iF
/bbqIH/j88HSYNDN518dv/ot+OfPn0vuxD4WzGAqC7ddHICtuaw8TBfXHmr6j4Mf
o9ytC/dH+AQ6Tcs+h7mbz/dfL4Gs9JK2LgRX2z1DtFa6uTw5jwEtZSuDF4DKovEg
71T67dsHDKJkQgGMDfDw1BrlLjQr8hkaWroF8CdmbRO0rRsfKamLQANG7mGkSu8B
l1W0ZzWEbHDteNB/wCkjoncXlcqoA8YiplyVSqf38aP8YuaeHMR3LRGLFwgXK3pB
lELh3cYSYLVpcqUgkRWcPrCUnLA2CkQy2B4INj5jFZilux7DzyIHMhWI2OuUs2rH
1rwTYO91gTVONuagZn8+lmsjs+Yq1n2BcyrGMG2+ZYzruFaQ+brQsCbQR0hdRuaU
zWLENqwGJ/++XrFgw3/5orMZIZQzL/yq+MbIXtrgKcFRWwoxyzJuH63RMQf3S6yy
hxi7pYQmb9if+HPJZVVg
=Bb5O
-----END PGP SIGNATURE-----
Merge tag 'for-v3.8-merged' of git://git.infradead.org/battery-2.6
Pull battery subsystem updates from Anton Vorontsov:
"Highlights:
- Two new drivers from Pali Rohár and N900 hackers: rx51_battery and
bq2415x_charger. The drivers are a part of a solution to replace
the proprietary Nokia BME stack
- Power supply core now registers devices with a thermal cooling
subsystem, so we can now automatically throttle charging. Thanks
to Ramakrishna Pallala!
- Device tree support for ab8500 and max8925_power drivers
- Random fixups and enhancements for a bunch of drivers."
* tag 'for-v3.8-merged' of git://git.infradead.org/battery-2.6: (22 commits)
max8925_power: Add support for device-tree initialization
ab8500: Add devicetree support for chargalg
ab8500: Add devicetree support for charger
ab8500: Add devicetree support for btemp
ab8500: Add devicetree support for fuelgauge
twl4030_charger: Change TWL4030_MODULE_* ids to TWL_MODULE_*
jz4740-battery: Use devm_request_and_ioremap
jz4740-battery: Use devm_kzalloc
bq27x00_battery: Fixup nominal available capacity reporting
bq2415x_charger: Fix style issues
bq2415x_charger: Add Kconfig/Makefile entries
power_supply: Add bq2415x charger driver
power_supply: Add new Nokia RX-51 (N900) power supply battery driver
max17042_battery: Fix missing verify_model_lock() return value check
ds2782_battery: Fix signedness bug in ds278x_read_reg16()
lp8788-charger: Fix ADC channel names
lp8788-charger: Fix wrong ADC conversion
lp8788-charger: Use consumer device name on setting IIO channels
power_supply: Register power supply for thermal cooling device
power_supply: Add support for CHARGE_CONTROL_* attributes
...
Pull media updates from Mauro Carvalho Chehab:
- Missing MAINTAINERS entries were added for several drivers
- Adds V4L2 support for DMABUF handling, allowing zero-copy buffer
sharing between V4L2 devices and GPU
- Got rid of all warnings when compiling with W=1 on x86
- Add a new driver for Exynos hardware (s3c-camif)
- Several bug fixes, cleanups and driver improvements
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (243 commits)
[media] omap3isp: Replace cpu_is_omap3630() with ISP revision check
[media] omap3isp: Prepare/unprepare clocks before/after enable/disable
[media] omap3isp: preview: Add support for 8-bit formats at the sink pad
[media] omap3isp: Replace printk with dev_*
[media] omap3isp: Find source pad from external entity
[media] omap3isp: Configure CSI-2 phy based on platform data
[media] omap3isp: Add PHY routing configuration
[media] omap3isp: Add CSI configuration registers from control block to ISP resources
[media] omap3isp: Remove unneeded module memory address definitions
[media] omap3isp: Use monotonic timestamps for statistics buffers
[media] uvcvideo: Fix control value clamping for unsigned integer controls
[media] uvcvideo: Mark first output terminal as default video node
[media] uvcvideo: Add VIDIOC_[GS]_PRIORITY support
[media] uvcvideo: Return -ENOTTY for unsupported ioctls
[media] uvcvideo: Set device_caps in VIDIOC_QUERYCAP
[media] uvcvideo: Don't fail when an unsupported format is requested
[media] uvcvideo: Return -EACCES when trying to access a read/write-only control
[media] uvcvideo: Set error_idx properly for extended controls API failures
[media] rtl28xxu: add NOXON DAB/DAB+ USB dongle rev 2
[media] fc2580: write some registers conditionally
...
This patch set includes two large new drivers: mpt3sas (for the next gen
fusion SAS hardware) and csiostor a FCoE offload driver for the Chelsio
converged network cards (this includes some net changes which I've OK'd with
DaveM).
The rest of the patch is driver updates (qla2xxx, lpfc, hptiop, be2iscsi) plus
a few assorted updates and bug fixes.
We also have a Power Management rework in the Upper Layer Drivers preparatory
to doing ACPI zero power optical devices, but the actual enabler is still
being worked on.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAABAgAGBQJQyE9RAAoJEDeqqVYsXL0Mf2oIAL+B2R7hM4RhZrVI0dq1x+og
o/B4JOnDXC7gFJTJXRjejEuAqmJRN7O1mxFV9sEo/zRa++Sd9YVPwQlcCFdesw0a
xU8aCAcy9hLlTcDK2pwhKN6i/anyIvl1Qec/574y9UhFxUsQz+7G9IvT7UmBqaYt
zVTvd4zX4ZHRBIyMTNzkSLGUHcJzKeMOrTFekJwQNDQpHXPJknOCqNiokhLPv0ET
Cl1JZS/jlF7g4FcePhmYyL/nGHfXp1/WYKneDVT7PFWNJc2RPTBDP3PfdN4mBJfc
bQXl/vRLtAjYDpHUJ4IKJbdtfFLkm4KS9ET3kwTkpZ2K6U3c9NklmBK3or2Vo1I=
=Gw2O
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull first round of SCSI updates from James Bottomley:
"This patch set includes two large new drivers: mpt3sas (for the next
gen fusion SAS hardware) and csiostor a FCoE offload driver for the
Chelsio converged network cards (this includes some net changes which
I've OK'd with DaveM).
The rest of the patch is driver updates (qla2xxx, lpfc, hptiop,
be2iscsi) plus a few assorted updates and bug fixes.
We also have a Power Management rework in the Upper Layer Drivers
preparatory to doing ACPI zero power optical devices, but the actual
enabler is still being worked on.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (72 commits)
[SCSI] mpt3sas: add new driver supporting 12GB SAS
[SCSI] scsi_transport_sas: add 12GB definitions for mpt3sas
[SCSI] miscdevice: Adding support for MPT3SAS_MINOR(222)
[SCSI] csiostor: remove unneeded memset()
[SCSI] csiostor: Fix sparse warnings.
[SCSI] qla2xxx: Display that driver is operating in legacy interrupt mode.
[SCSI] qla2xxx: Dont clear drv active on iospace config failure.
[SCSI] qla2xxx: Fix typo in qla2xxx driver.
[SCSI] qla2xxx: Update ql2xextended_error_logging parameter description with new option.
[SCSI] qla2xxx: Parameterize the link speed of hba rather than fcport.
[SCSI] qla2xxx: Add 16Gb/s case to get port speed capability.
[SCSI] qla2xxx: Move marking fcport online ahead of setting iiDMA speed.
[SCSI] qla2xxx: Add acquiring of risc semaphore before doing ISP reset.
[SCSI] qla2xxx: Ignore driver ack bit if corresponding presence bit is not set.
[SCSI] qla2xxx: Fix typo in qla83xx_fw_dump function.
[SCSI] qla2xxx: Add Gen3 PCIe speed 8GT/s to the log message.
[SCSI] qla2xxx: Use correct Request-Q-Out register during bidirectional request processing
[SCSI] qla2xxx: Move noisy Start scsi failed messages to verbose logging level.
[SCSI] qla2xxx: Fix coccinelle warnings in qla2x00_relogin.
[SCSI] qla2xxx: No fcport FC-4 type assignment in GA_NXT response.
...
- A good chunk of Bart Van Assche's SRP fixes
- UAPI disintegration from David Howells
- mlx4 support for "64-byte CQE" hardware feature from Or Gerlitz
- Other miscellaneous fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABCAAGBQJQxstjAAoJEENa44ZhAt0hURUQAJd7HumReKTdRqzIzXPc+rgl
pRR5eqplPY2anfJMqLDiFphVjfCiKyhudomdo+RUbBFFnUVLlBzk80A0/IZ3g3PZ
MHOT+pX4PGDd+3FQxV2AaQCMwgGbvC0haInXyQDVZGm0fbMjRd699yGVWBiA8rOI
VNhUi5WMmynSINYokM8UxrhfoUfy3QxsOvZBZ3XUD1zjJB0IMd5HRdiDUG7ur0q+
rfpWKv51DXT81ux36MXbdPBhLRbzx4B7EwuPWOFPqJe1KwK2cD8iA6DwEKC9KMxS
Kj2+CxB5Bfpfz8bhLi2VZcMgAKiSIQDXUtiKz8h0yFVhvADYZLU7zdGN49mCqKcY
9dwX8+0aIVez6WB2jH+ir2FSG65NsnvqESwQ4LLQ9bhArgf9fapVGlypHwcKi5hh
3j2ipO/RyT56nLQeI0gz1P5mQneFSWlY96CD8WP+9OxO/mVnxViajzevSwT/cLE6
IOMks8DPhsQK88JXSx0XKVxn3zrJ9SXbYDhRWJ6f4w/fxraRXlFdQi0UfcsAajkX
5qmM4e8Oy97TJYiY1RkAmb7aV182xMWVjtDx2FFTQ5ukgDea/DklIM/JNQ475027
N7zMW1tP6+gnnDyMEkteVuPdbl1fzwI3RdXCh0mFZHZ5tvegkdxbw0XxERcevnQN
LZfME8wCuC7+RtmE38Li
=TQK2
-----END PGP SIGNATURE-----
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull infiniband upate from Roland Dreier:
"First batch of InfiniBand/RDMA changes for the 3.8 merge window:
- A good chunk of Bart Van Assche's SRP fixes
- UAPI disintegration from David Howells
- mlx4 support for "64-byte CQE" hardware feature from Or Gerlitz
- Other miscellaneous fixes"
Fix up trivial conflict in mellanox/mlx4 driver.
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (33 commits)
RDMA/nes: Fix for crash when registering zero length MR for CQ
RDMA/nes: Fix for terminate timer crash
RDMA/nes: Fix for BUG_ON due to adding already-pending timer
IB/srp: Allow SRP disconnect through sysfs
srp_transport: Document sysfs attributes
srp_transport: Simplify attribute initialization code
srp_transport: Fix attribute registration
IB/srp: Document sysfs attributes
IB/srp: send disconnect request without waiting for CM timewait exit
IB/srp: destroy and recreate QP and CQs when reconnecting
IB/srp: Eliminate state SRP_TARGET_DEAD
IB/srp: Introduce the helper function srp_remove_target()
IB/srp: Suppress superfluous error messages
IB/srp: Process all error completions
IB/srp: Introduce srp_handle_qp_err()
IB/srp: Simplify SCSI error handling
IB/srp: Keep processing commands during host removal
IB/srp: Eliminate state SRP_TARGET_CONNECTING
IB/srp: Increase block layer timeout
RDMA/cm: Change return value from find_gid_port()
...
Primarily SPI device driver bug fixes, one removal of an old driver, and
some new tegra support. There is some core code change too, but all in
all pretty small stuff. The new features to note are:
- Common code for describing GPIO CS lines in the device tree
- Remove the SPI_BUFSIZ limitation on spi_write_the_read()
- core spi ensures bits_per_word is set correctly
- SPARC can now use SPI
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQyg7oAAoJEEFnBt12D9kBdeIP/j29x9sdtN/jXOXi5kdpnuuo
SnksHB8pvP8OUq12Zx6X8rT7jrpwQLLEz1WwZcrTxZv6/aVm3MiUTLqqgwM++4Ve
MbNWuLN525ifvpD2421EIClzpSz7NjgsjKQoGK6+w8X03YVPJrbBYKpxvs4JWt/g
IEPuJT0J5J3g6cFDk41SN0DoXKpZijwIWcJ9hCDZimwYXxxs6QloD3Z7hoGVyGp9
7zP3CB4muXM9BDXnAy6GK9bRz6m31soWKDto0bmH93Axfk+2IEAmxW6dE7uhGFfA
o4IWMAWeZ5YOiAOYzANJTkdOqHE36TfdaEZGG2wqTUKbnsQwCgnWVp8vqNPJMoi/
1o/Zh2wBMcxuUFEL6h/fqBsLvAH3jKOaTf+dzBsz6+eK5FHxvI6BvsdvfdlTvPsv
K0hKkGW0F55sYEty4dPr7toea0tCBsAm4tZmjj93afH824U3p/DYL8Av8SYn3vaP
xiImsS4bxu75kLXquHFypclbEAVHc5iFqWATY8CGZrECsX/euzh6IW4CcfbZg+wi
8v309vBkjs0K/1xz034GpK+sGnmVBAO99WZUWQmJW0pwfBGJflvRJQ1Rw49ZAFAw
60I3jKN84ZbvuMKvhaoCYSIgv0euMeb1To4acGEVfgbpdVH5YoohtL1kZw/1x+Ob
YD+qukZS/4PEJlr6VzLA
=bvpH
-----END PGP SIGNATURE-----
Merge tag 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6
Pull SPI updates from Grant Likely:
"Primarily SPI device driver bug fixes, one removal of an old driver,
and some new tegra support. There is some core code change too, but
all in all pretty small stuff.
The new features to note are:
- Common code for describing GPIO CS lines in the device tree
- Remove the SPI_BUFSIZ limitation on spi_write_the_read()
- core spi ensures bits_per_word is set correctly
- SPARC can now use SPI"
* tag 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6: (36 commits)
spi/sparc: Allow of_register_spi_devices for sparc
spi: Remove HOTPLUG section attributes
spi: Add support for specifying 3-wire mode via device tree
spi: Fix comparison of different integer types
spi/orion: Add SPI_CHPA and SPI_CPOL support to kirkwood driver.
spi/sh: Add SH Mobile series as dependency to MSIOF controller
spi/sh-msiof: Remove unneeded clock name
spi: Remove SPI_BUFSIZ restriction on spi_write_then_read()
spi/stmp: remove obsolete driver
spi/clps711x: New SPI master driver
spi: omap2-mcspi: remove duplicate inclusion of linux/err.h
spi: omap2-mcspi: Fix the redifine warning
spi/sh-hspi: add CS manual control support
of_spi: add generic binding support to specify cs gpio
spi: omap2-mcspi: remove duplicated include from spi-omap2-mcspi.c
spi/bitbang: (cosmetic) simplify list manipulation
spi/bitbang: avoid needless loop flow manipulations
spi/omap: fix D0/D1 direction confusion
spi: tegra: add spi driver for sflash controller
spi: Dont call master->setup if not populated
...
Define a constant to hold the marker value seen in an indefinite-length
element.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Jan Beulich points out __COUNTER__ (gcc 4.3 and above), so let's use
that to create unique ids. This is better than __LINE__ which we use
today, so provide a wrapper.
Stanislaw Gruszka <sgruszka@redhat.com> reported that some module parameters
start with a digit, so we need to prepend when we for the unique id.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Jan Beulich <jbeulich@suse.com>
With the addition of the new kernel module syscall, which defines two
arguments - a file descriptor to the kernel module and a pointer to a NULL
terminated string of module arguments - it is now possible to measure and
appraise kernel modules like any other file on the file system.
This patch adds support to measure and appraise kernel modules in an
extensible and consistent manner.
To support filesystems without extended attribute support, additional
patches could pass the signature as the first parameter.
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now that kernel module origins can be reasoned about, provide a hook to
the LSMs to make policy decisions about the module file. This will let
Chrome OS enforce that loadable kernel modules can only come from its
read-only hash-verified root filesystem. Other LSMs can, for example,
read extended attributes for signatures, etc.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Thanks to Michael Kerrisk for keeping us honest. These flags are actually
useful for eliminating the only case where kmod has to mangle a module's
internals: for overriding module versioning.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Acked-by: Kees Cook <keescook@chromium.org>
As part of the effort to create a stronger boundary between root and
kernel, Chrome OS wants to be able to enforce that kernel modules are
being loaded only from our read-only crypto-hash verified (dm_verity)
root filesystem. Since the init_module syscall hands the kernel a module
as a memory blob, no reasoning about the origin of the blob can be made.
Earlier proposals for appending signatures to kernel modules would not be
useful in Chrome OS, since it would involve adding an additional set of
keys to our kernel and builds for no good reason: we already trust the
contents of our root filesystem. We don't need to verify those kernel
modules a second time. Having to do signature checking on module loading
would slow us down and be redundant. All we need to know is where a
module is coming from so we can say yes/no to loading it.
If a file descriptor is used as the source of a kernel module, many more
things can be reasoned about. In Chrome OS's case, we could enforce that
the module lives on the filesystem we expect it to live on. In the case
of IMA (or other LSMs), it would be possible, for example, to examine
extended attributes that may contain signatures over the contents of
the module.
This introduces a new syscall (on x86), similar to init_module, that has
only two arguments. The first argument is used as a file descriptor to
the module and the second argument is a pointer to the NULL terminated
string of module arguments.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (merge fixes)
Pull KVM updates from Marcelo Tosatti:
"Considerable KVM/PPC work, x86 kvmclock vsyscall support,
IA32_TSC_ADJUST MSR emulation, amongst others."
Fix up trivial conflict in kernel/sched/core.c due to cross-cpu
migration notifier added next to rq migration call-back.
* tag 'kvm-3.8-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (156 commits)
KVM: emulator: fix real mode segment checks in address linearization
VMX: remove unneeded enable_unrestricted_guest check
KVM: VMX: fix DPL during entry to protected mode
x86/kexec: crash_vmclear_local_vmcss needs __rcu
kvm: Fix irqfd resampler list walk
KVM: VMX: provide the vmclear function and a bitmap to support VMCLEAR in kdump
x86/kexec: VMCLEAR VMCSs loaded on all cpus if necessary
KVM: MMU: optimize for set_spte
KVM: PPC: booke: Get/set guest EPCR register using ONE_REG interface
KVM: PPC: bookehv: Add EPCR support in mtspr/mfspr emulation
KVM: PPC: bookehv: Add guest computation mode for irq delivery
KVM: PPC: Make EPCR a valid field for booke64 and bookehv
KVM: PPC: booke: Extend MAS2 EPN mask for 64-bit
KVM: PPC: e500: Mask MAS2 EPN high 32-bits in 32/64 tlbwe emulation
KVM: PPC: Mask ea's high 32-bits in 32/64 instr emulation
KVM: PPC: e500: Add emulation helper for getting instruction ea
KVM: PPC: bookehv64: Add support for interrupt handling
KVM: PPC: bookehv: Remove GET_VCPU macro from exception handler
KVM: PPC: booke: Fix get_tb() compile error on 64-bit
KVM: PPC: e500: Silence bogus GCC warning in tlb code
...
Pull s390 update from Martin Schwidefsky:
"Add support to generate code for the latest machine zEC12, MOD and XOR
instruction support for the BPF jit compiler, the dasd safe offline
feature and the big one: the s390 architecture gets PCI support!!
Right before the world ends on the 21st ;-)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (41 commits)
s390/qdio: rename the misleading PCI flag of qdio devices
s390/pci: remove obsolete email addresses
s390/pci: speed up __iowrite64_copy by using pci store block insn
s390/pci: enable NEED_DMA_MAP_STATE
s390/pci: no msleep in potential IRQ context
s390/pci: fix potential NULL pointer dereference in dma_free_seg_table()
s390/pci: use kmem_cache_zalloc instead of kmem_cache_alloc/memset
s390/bpf,jit: add support for XOR instruction
s390/bpf,jit: add support MOD instruction
s390/cio: fix pgid reserved check
vga: compile fix, disable vga for s390
s390/pci: add PCI Kconfig options
s390/pci: s390 specific PCI sysfs attributes
s390/pci: PCI hotplug support via SCLP
s390/pci: CHSC PCI support for error and availability events
s390/pci: DMA support
s390/pci: PCI adapter interrupts for MSI/MSI-X
s390/bitops: find leftmost bit instruction support
s390/pci: CLP interface
s390/pci: base support
...
Merge misc VM changes from Andrew Morton:
"The rest of most-of-MM. The other MM bits await a slab merge.
This patch includes the addition of a huge zero_page. Not a
performance boost but it an save large amounts of physical memory in
some situations.
Also a bunch of Fujitsu engineers are working on memory hotplug.
Which, as it turns out, was badly broken. About half of their patches
are included here; the remainder are 3.8 material."
However, this merge disables CONFIG_MOVABLE_NODE, which was totally
broken. We don't add new features with "default y", nor do we add
Kconfig questions that are incomprehensible to most people without any
help text. Does the feature even make sense without compaction or
memory hotplug?
* akpm: (54 commits)
mm/bootmem.c: remove unused wrapper function reserve_bootmem_generic()
mm/memory.c: remove unused code from do_wp_page()
asm-generic, mm: pgtable: consolidate zero page helpers
mm/hugetlb.c: fix warning on freeing hwpoisoned hugepage
hwpoison, hugetlbfs: fix RSS-counter warning
hwpoison, hugetlbfs: fix "bad pmd" warning in unmapping hwpoisoned hugepage
mm: protect against concurrent vma expansion
memcg: do not check for mm in __mem_cgroup_count_vm_event
tmpfs: support SEEK_DATA and SEEK_HOLE (reprise)
mm: provide more accurate estimation of pages occupied by memmap
fs/buffer.c: remove redundant initialization in alloc_page_buffers()
fs/buffer.c: do not inline exported function
writeback: fix a typo in comment
mm: introduce new field "managed_pages" to struct zone
mm, oom: remove statically defined arch functions of same name
mm, oom: remove redundant sleep in pagefault oom handler
mm, oom: cleanup pagefault oom handler
memory_hotplug: allow online/offline memory to result movable node
numa: add CONFIG_MOVABLE_NODE for movable-dedicated node
mm, memcg: avoid unnecessary function call when memcg is disabled
...
Host bridge hotplug:
- Untangle _PRT from struct pci_bus (Bjorn Helgaas)
- Request _OSC control before scanning root bus (Taku Izumi)
- Assign resources when adding host bridge (Yinghai Lu)
- Remove root bus when removing host bridge (Yinghai Lu)
- Remove _PRT during hot remove (Yinghai Lu)
SRIOV
- Add sysfs knobs to control numVFs (Don Dutile)
Power management
- Notify devices when power resource turned on (Huang Ying)
Bug fixes
- Work around broken _SEG on HP xw9300 (Bjorn Helgaas)
- Keep runtime PM enabled for unbound PCI devices (Huang Ying)
- Fix Optimus dual-GPU runtime D3 suspend issue (Dave Airlie)
- Fix xen frontend shutdown issue (David Vrabel)
- Work around PLX PCI 9050 BAR alignment erratum (Ian Abbott)
Miscellaneous
- Add GPL license for drivers/pci/ioapic (Andrew Cooks)
- Add standard PCI-X, PCIe ASPM register #defines (Bjorn Helgaas)
- NumaChip remote PCI support (Daniel Blueman)
- Fix PCIe Link Capabilities Supported Link Speed definition (Jingoo Han)
- Convert dev_printk() to dev_info(), etc (Joe Perches)
- Add support for non PCI BAR ROM data (Matthew Garrett)
- Add x86 support for host bridge translation offset (Mike Yoknis)
- Report success only when every driver supports AER (Vijay Pandarathil)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iQIcBAABAgAGBQJQyKwSAAoJEPGMOI97Hn6zScgQAJZK2VDfCv74mKrgSDNokIzH
5nVDrc9AHKJm7CUODs6keJK5d4TD/za3Zao68zrYHsJJKes2ni2Z3W34HP2RXKK2
eOmePXOHYPPZMlimP9r9cVxNu1ZJCyp/yWSBcsPF4zUgWhBWLRaSj85I049gQ0sz
+05nZYfLjVd3HNiaXsG4CQyMrNF46XEsLhF9vs+Nr2GHPwrpzhfScgYv63oDS86C
3ICKsjmiRUZcNelxIFYmyxa5u89QdW5XHjzc9eHGQuus24Vxw+TZzsdfc17sUJEE
HTyXY+RjDpOVhdtwwUjrCEOiyZYvy3g9+3sKxoxgt/76ghdUaR7fxITwB97qVMFD
T0ESlKjSV/Qv5QYdyy5uP4zwNs/PXCWXkTg/L1m71F30BxKWDa7tgiA6uK7Z7fl5
1aokKBdk3mtJJJIDJG1YkxPXx/JItTGCNYrx7CcFj49rSjrUWLQdmrYahersRIsB
3wiD2xTi9e4dXeP/+VGzGOWB/sHk+73jvrvZe/REa1FCnMINDz4+9V9WaGROMqyq
MQ8kX0KfYcNVNxy1GOXjU5wLpMN/t/QbvI7gwzRP1DAUCJPoOgFy7AjvSTVG3zuy
8CtdOFttVkUn5dqsbQR0gVbyQVTS3PGSKz5XC/s8kVDWhja0xZTBYwrskM/4zdSD
Xf48OyYV5EjpC3FYUSiU
=OE3Q
-----END PGP SIGNATURE-----
Merge tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI update from Bjorn Helgaas:
"Host bridge hotplug:
- Untangle _PRT from struct pci_bus (Bjorn Helgaas)
- Request _OSC control before scanning root bus (Taku Izumi)
- Assign resources when adding host bridge (Yinghai Lu)
- Remove root bus when removing host bridge (Yinghai Lu)
- Remove _PRT during hot remove (Yinghai Lu)
SRIOV
- Add sysfs knobs to control numVFs (Don Dutile)
Power management
- Notify devices when power resource turned on (Huang Ying)
Bug fixes
- Work around broken _SEG on HP xw9300 (Bjorn Helgaas)
- Keep runtime PM enabled for unbound PCI devices (Huang Ying)
- Fix Optimus dual-GPU runtime D3 suspend issue (Dave Airlie)
- Fix xen frontend shutdown issue (David Vrabel)
- Work around PLX PCI 9050 BAR alignment erratum (Ian Abbott)
Miscellaneous
- Add GPL license for drivers/pci/ioapic (Andrew Cooks)
- Add standard PCI-X, PCIe ASPM register #defines (Bjorn Helgaas)
- NumaChip remote PCI support (Daniel Blueman)
- Fix PCIe Link Capabilities Supported Link Speed definition (Jingoo
Han)
- Convert dev_printk() to dev_info(), etc (Joe Perches)
- Add support for non PCI BAR ROM data (Matthew Garrett)
- Add x86 support for host bridge translation offset (Mike Yoknis)
- Report success only when every driver supports AER (Vijay
Pandarathil)"
Fix up trivial conflicts.
* tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (48 commits)
PCI: Use phys_addr_t for physical ROM address
x86/PCI: Add NumaChip remote PCI support
ath9k: Use standard #defines for PCIe Capability ASPM fields
iwlwifi: Use standard #defines for PCIe Capability ASPM fields
iwlwifi: collapse wrapper for pcie_capability_read_word()
iwlegacy: Use standard #defines for PCIe Capability ASPM fields
iwlegacy: collapse wrapper for pcie_capability_read_word()
cxgb3: Use standard #defines for PCIe Capability ASPM fields
PCI: Add standard PCIe Capability Link ASPM field names
PCI/portdrv: Use PCI Express Capability accessors
PCI: Use standard PCIe Capability Link register field names
x86: Use PCI setup data
PCI: Add support for non-BAR ROMs
PCI: Add pcibios_add_device
EFI: Stash ROMs if they're not in the PCI BAR
PCI: Add and use standard PCI-X Capability register names
PCI/PM: Keep runtime PM enabled for unbound PCI devices
xen-pcifront: Handle backend CLOSED without CLOSING
PCI: SRIOV control and status via sysfs (documentation)
PCI/AER: Report success only when every device has AER-aware driver
...
A fairly quiet release again, a couple of relatively small new features
and a bunch of driver specific work including yet more code elimination
and fixes from Axel Lin.
- Addidion of linear_min_sel for offsetting linear selectors in the
helpers.
- Support for continuous voltage ranges for regulators with extremely
high resolution.
- Drivers for AS3711, DA9055, MAX9873, TPS51632, TPS80031 and ARM vexpress.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJQxyvhAAoJELSic+t+oim9F90P+QEQl/gNYmusoGbUmmgnybft
hHmxpJXKhUSMSm7Lju3FLBhA7+gGnm1baWhOBGWMKM5iawlpHSUcognO2LRwRRiT
d54RS6+J5vUR6AE/w4EEaIBPcgg6kAXpwe9Q22EAfJTPTndTjSuKu8HikKkMw89H
RX964WKPfJDBrNBxfcJMWyQUR+89q3BcXk3oKWK5RTGJ+TvjvPnNd66/or5x+a7A
H9xov93MMWBiS59MxUeAO7DTIOZg2p3TIPHlUFW3hoobbOe0EWdvqkpDJYW75ss9
53B/pO/wsct6yVu86zD8MGo0Uzu7NE3zZoR2q0EflPHV+A5U6DiOS+jW6hk+0v/B
gUhreHiCNZ8veHzLujAuuESGeZG8W5qtavMI2em3eeIkGNogBMiAjwZqNfOWce/+
G3wVdpKhdGFXa4zmlD5yfOeXpjXb3agSP8i0hG/pVkDuok6KU/BnzTOi+m6uagdF
GfJJI4fgoa0s02H/AicVMY2NOFBeQWPuyGTFA73tsUa0uhSPYlf0nZdIkmDYN3QX
r8az6j0p3JRpvQEcADezdzcSGU6ulx5UljZn089kFDulIumjRM4fcbPWIeednvYg
P3+CUmpE/xktPuohTg/si/KiJiUS/gIw2FHGrGp3TpcmoXFgcI9uKu/F2OGCZfGT
t2ZqLsA6vn06Cvweml2/
=2d65
-----END PGP SIGNATURE-----
Merge tag 'regulator-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator updates from Mark Brown:
"A fairly quiet release again, a couple of relatively small new
features and a bunch of driver specific work including yet more code
elimination and fixes from Axel Lin.
- Addidion of linear_min_sel for offsetting linear selectors in the
helpers.
- Support for continuous voltage ranges for regulators with extremely
high resolution.
- Drivers for AS3711, DA9055, MAX9873, TPS51632, TPS80031 and ARM
vexpress."
Fix up trivial conflict (due to typo fix) in palmas-regulator.c
* tag 'regulator-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (80 commits)
regulator: core: Fix logic to determinate if regulator can change voltage
regulator: s5m8767: Fix to work even if no DVS gpio present
regulator: s5m8767: Fix to read the first DVS register.
regulator: s5m8767: Fix to work when platform registers less regulators
regulator: gpio-regulator: gpio_set_value should use cansleep
regulator: gpio-regulator: Fix logical error in for() loop
regulator: anatop: Use regulator_[get|set]_voltage_sel_regmap
regulator: anatop: Use linear_min_sel with linear mapping
regulator: max1586: Implement get_voltage_sel callback
regulator: lp8788-buck: Kill _gpio_request function
regulator: tps80031: Convert tps80031_ldo_ops to linear_min_sel and list_voltage_linear
regulator: lp8788-ldo: Remove val array in lp8788_config_ldo_enable_mode
regulator: gpio-regulator: Add ifdef CONFIG_OF guard for regulator_gpio_of_match
regulator: palmas: Convert palmas_ops_smps to regulator_[get|set]_voltage_sel_regmap
regulator: palmas: Return raw register values as the selectors in [get|set]_voltage_sel
regulators: add regulator_can_change_voltage() function
regulator: tps51632: Ensure [base|max]_voltage_uV pdata settings are valid
regulator: wm831x-dcdc: Add MODULE_ALIAS for wm831x-boostp
regulator: wm831x-dcdc: Ensure selected voltage falls within requested range
regulator: tps51632: Use linear_min_sel and regulator_[map|list]_voltage_linear
...
Pull HID subsystem updates from Jiri Kosina:
1) Support for HID over I2C bus has been added by Benjamin Tissoires.
ACPI device discovery is still in the works.
2) Support for Win8 Multitiouch protocol is being added, most work done
by Benjamin Tissoires as well
3) EIO/ERESTARTSYS is fixed in hiddev/hidraw, fixes by Andrew Duggan
and Jiri Kosina
4) ION iCade driver added by Bastien Nocera
5) Support for a couple new Roccat devices has been added by Stefan
Achatz
6) HID sensor hubs are now auto-detected instead of having to list all
the VID/PID combinations in the blacklist array
7) other random fixes and support for new device IDs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (65 commits)
HID: i2c-hid: add mutex protecting open/close race
Revert "HID: sensors: add to special driver list"
HID: sensors: autodetect USB HID sensor hubs
HID: hidp: fallback to input session properly if hid is blacklisted
HID: i2c-hid: fix ret_count check
HID: i2c-hid: fix i2c_hid_get_raw_report count mismatches
HID: i2c-hid: remove extra .irq field in struct i2c_hid
HID: i2c-hid: reorder allocation/free of buffers
HID: i2c-hid: fix memory corruption due to missing hid declaration
HID: i2c-hid: remove superfluous include
HID: i2c-hid: remove unneeded test in i2c_hid_remove
HID: i2c-hid: i2c_hid_get_report may fail
HID: i2c-hid: also call i2c_hid_free_buffers in i2c_hid_remove
HID: i2c-hid: fix error messages
HID: i2c-hid: fix return paths
HID: i2c-hid: remove unused static declarations
HID: i2c-hid: fix i2c_hid_dbg macro
HID: i2c-hid: fix checkpatch.pl warning
HID: i2c-hid: enhance Kconfig
HID: i2c-hid: change I2C name
...
Pull trivial branch from Jiri Kosina:
"Usual stuff -- comment/printk typo fixes, documentation updates, dead
code elimination."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
HOWTO: fix double words typo
x86 mtrr: fix comment typo in mtrr_bp_init
propagate name change to comments in kernel source
doc: Update the name of profiling based on sysfs
treewide: Fix typos in various drivers
treewide: Fix typos in various Kconfig
wireless: mwifiex: Fix typo in wireless/mwifiex driver
messages: i2o: Fix typo in messages/i2o
scripts/kernel-doc: check that non-void fcts describe their return value
Kernel-doc: Convention: Use a "Return" section to describe return values
radeon: Fix typo and copy/paste error in comments
doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c
various: Fix spelling of "asynchronous" in comments.
Fix misspellings of "whether" in comments.
eisa: Fix spelling of "asynchronous".
various: Fix spelling of "registered" in comments.
doc: fix quite a few typos within Documentation
target: iscsi: fix comment typos in target/iscsi drivers
treewide: fix typo of "suport" in various comments and Kconfig
treewide: fix typo of "suppport" in various comments
...
This update contains a fairly wide range of changes all over in sound
subdirectory, mainly because of UAPI header moves by David and __dev*
annotation removals by Bill. Other highlights are:
- Introduced the support for wallclock timestamps in ALSA PCM core
- Add the poll loop implementation for HD-audio jack detection
- Yet more VGA-switcheroo fixes for HD-audio
- New VIA HD-audio codec support
- More fixes on resource management in USB audio and MIDI drivers
- More quirks for USB-audio ASUS Xonar U3, Reloop Play, Focusrite,
Roland VG-99, etc
- Add support for FastTrack C400 usb-audio
- Clean ups in many drivers regarding firmware loading
- Add PSC724 Ultiimate Edge support to ice1712
- A few hdspm driver updates
- New Stanton SCS.1d/1m FireWire driver
- Standardisation of the logging in ASoC codes
- DT and dmaengine support for ASoC Atmel
- Support for Wolfson ADSP cores
- New drivers for Freescale/iVeia P1022 and Maxim MAX98090
- Lots of other ASoC driver fixes and developments
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJQyYh9AAoJEGwxgFQ9KSmk95UP+wcVXRynWulCVwEnVjjWv/UM
2fIsD0RkWO4WoqEVCNCHzW27pvFC36uqLx5vtas4CoJ2CX8ufQsPqWTx6K3MY/DQ
/VJT8CI2EKPQY1Q0PySVo2nBqwcUXWWoZE47t1ZowR6cxtmgrS8lLxYJvkbfEOHF
PS8WItACqH5F5VPxGFkhPwR2OTLfaYt7ilKx82vgxSzdMAel8q/I9uS/MUv8KV+1
1s5yBvlW5eyDXKpQbP/KiMJLJ/zk1MRAKK2HftAk8pNt+Xy160NvXhbILDWKC4uT
QBRPqyIe8BmL3VlXpw3mn7nbeU7rSfc/Rbrnm2+Yb54/Wtj4PnV6Z6bg7HO610im
e/tBNHoaL7qn1iNYSC3heW11rToksd03/LK0GREvkX3Bl21T+Naaun/DY/PGIfMQ
nOXy7uVo6sVdZnN2MTWof9qeMYBSlrwQVsMde6kbYadNcXnuqUqCOx41kiAdE8t5
jfZy5QR+uDjdJgDv8/CuiCYxSjjTUfO1tdSV/VjsTK17Gfw3DWTJpeznVnoIALbZ
81HXYxb+uGZ8xDr0WXGkydlnPvB6bnWu2vvfrLBkioA/p60EDNCtgKo1y3NzQGh8
P2qILNhXIRrS2EoAScQOb6F1RPsvbDN9PbihnkShX3r7DXyeynr9jgPy3L2vf0bR
xyu0jtQiqYb34ss7qqdz
=IIsz
-----END PGP SIGNATURE-----
Merge tag 'sound-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"This update contains a fairly wide range of changes all over in sound
subdirectory, mainly because of UAPI header moves by David and __dev*
annotation removals by Bill. Other highlights are:
- Introduced the support for wallclock timestamps in ALSA PCM core
- Add the poll loop implementation for HD-audio jack detection
- Yet more VGA-switcheroo fixes for HD-audio
- New VIA HD-audio codec support
- More fixes on resource management in USB audio and MIDI drivers
- More quirks for USB-audio ASUS Xonar U3, Reloop Play, Focusrite,
Roland VG-99, etc
- Add support for FastTrack C400 usb-audio
- Clean ups in many drivers regarding firmware loading
- Add PSC724 Ultiimate Edge support to ice1712
- A few hdspm driver updates
- New Stanton SCS.1d/1m FireWire driver
- Standardisation of the logging in ASoC codes
- DT and dmaengine support for ASoC Atmel
- Support for Wolfson ADSP cores
- New drivers for Freescale/iVeia P1022 and Maxim MAX98090
- Lots of other ASoC driver fixes and developments"
Fix up trivial conflicts. And go out on a limb and assume the dts file
'status' field of one of the conflicting things was supposed to be
"disabled", not "disable" like in pretty much all other cases.
* tag 'sound-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (341 commits)
ALSA: hda - Move runtime PM check to runtime_idle callback
ALSA: hda - Add stereo-dmic fixup for Acer Aspire One 522
ALSA: hda - Avoid doubly suspend after vga switcheroo
ALSA: usb-audio: Enable S/PDIF on the ASUS Xonar U3
ALSA: hda - Check validity of CORB/RIRB WP reads
ALSA: hda - use usleep_range in link reset and change timeout check
ALSA: HDA: VIA: Add support for codec VT1808.
ALSA: HDA: VIA Add support for codec VT1705CF.
ASoC: codecs: remove __dev* attributes
ASoC: utils: remove __dev* attributes
ASoC: ux500: remove __dev* attributes
ASoC: txx9: remove __dev* attributes
ASoC: tegra: remove __dev* attributes
ASoC: spear: remove __dev* attributes
ASoC: sh: remove __dev* attributes
ASoC: s6000: remove __dev* attributes
ASoC: OMAP: remove __dev* attributes
ASoC: nuc900: remove __dev* attributes
ASoC: mxs: remove __dev* attributes
ASoC: kirkwood: remove __dev* attributes
...
This branch contains a largeish set of updates of power management and
clock setup. The bulk of it is for OMAP/AM33xx platforms, but also a
few around hotplug/suspend/resume on Exynos.
It includes a split-up of some of the OMAP clock data into separate
files which adds to the diffstat, but gross delta is fairly reasonable.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQySrdAAoJEIwa5zzehBx3XBMQAJCgrE/I1vbuiENiVMLpQkN7
0Y5t89SwoALnme5jsM8vW/H0o6b163FfH+249UNs6dyk6MdCqqSv8cQVvSfCKRKn
m4JTXS6njSnYgU025klNU9W3FyMuFcgH8b63+c5+sCcqfvdkxrjpNoKqAYNxQlgJ
6DHgHCKZP3JB1e2EFqViw0GJgZIjTNxlWvx8XLcAShSapQ+Ovq/iAcKo41o7ZadB
4zV//VkBMv0TLNTs6SoR0EO8qM+2nIUKE8fgPrRxvvb98tuasKPvlSq9VUeIjnYk
kWjNTQYbD1IkrAMPYrwcpLU3xkEr14wrKDBtPK6GdCDcXzdyKq3fzROOXNsqrLmn
Y8PkY+J39B59F0DKjyrCjasZyi0tQQV6ps5Xm2X2CB003GWEbo1yu+BodShYEyaL
7OvkLiVpOtq+LOYcsScHtOGdO9le2O3yZevNRvlrCDCvJUYbXNjaM9ZrWcQuTlJc
oseYNSRPaIP5PMv2c+Xup9qh/dyAjj8g6gSi9BlDvwVOu/+Cf3laaCAXaFph22jT
/m8fW2paiP5UJ0gSt1MLAGFxKi0YI/Qxck8G11LJxUwMZd3SIfOPUAY27tnNC4yL
GRynOi3BFRMUAe/Leu2qgwEyoME5nHn+OmAKb36WTYH/HL+PGzHq84e+Wfdan8ha
YcSKc1A52LSoYTi4XTUX
=h2Qo
-----END PGP SIGNATURE-----
Merge tag 'pm-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC power management and clock changes from Olof Johansson:
"This branch contains a largeish set of updates of power management and
clock setup. The bulk of it is for OMAP/AM33xx platforms, but also a
few around hotplug/suspend/resume on Exynos.
It includes a split-up of some of the OMAP clock data into separate
files which adds to the diffstat, but gross delta is fairly reasonable."
* tag 'pm-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (60 commits)
ARM: OMAP: Move plat-omap/dma-omap.h to include/linux/omap-dma.h
ASoC: OMAP: mcbsp fixes for enabling ARM multiplatform support
watchdog: OMAP: fixup for ARM multiplatform support
ARM: EXYNOS: Add flush_cache_all in suspend finisher
ARM: EXYNOS: Remove scu_enable from cpuidle
ARM: EXYNOS: Fix soft reboot hang after suspend/resume
ARM: EXYNOS: Add support for rtc wakeup
ARM: EXYNOS: fix the hotplug for Cortex-A15
ARM: OMAP2+: omap_device: Correct resource handling for DT boot
ARM: OMAP2+: hwmod: Add possibility to count hwmod resources based on type
ARM: OMAP2+: hwmod: Add support for per hwmod/module context lost count
ARM: OMAP2+: PRM: initialize some PRM functions early
ARM: OMAP2+: voltage: fixup oscillator handling when CONFIG_PM=n
ARM: OMAP4: USB: power down MUSB PHY during boot
ARM: OMAP2+: clock: Cleanup !CONFIG_COMMON_CLK parts
ARM: OMAP2xxx: clock: drop obsolete clock data
ARM: OMAP2: clock: Cleanup !CONFIG_COMMON_CLK parts
ARM: OMAP3+: DPLL: drop !CONFIG_COMMON_CLK sections
ARM: AM33xx: clock: drop obsolete clock data
ARM: OMAP3xxx: clk: drop obsolete clock data
...
Here are more patches in the progression towards multiplatform, sparse
irq conversions in particular.
Tegra has a handful of cleanups and general groundwork, but is
not quite there yet on full enablement.
Platforms that are enabled through this branch are VT8500 and Zynq. note
that i.MX was converted in one of the earlier cleanup branches as
well (before we started a separate topic for multiplatform). And both
new platforms for this merge window, sunxi and bcm, were merged with
multiplatform support enabled.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQySb/AAoJEIwa5zzehBx3Wo4P/0GrpUhB/qwuhgy43MA2I1Dv
tnyuFvsfW9uRExcw2IwT39GFls98QUM9TwQxPqOTHVf+u0LkYMZ9aDeWJOdj3RvG
H70Ypj4gZDrzZAFr2TUf8NnYGHd6G2EcMn3261Hjfd7YrswCjsMPvgRns7VOyHCa
deif3KcLu3+HzxvuzqlVlTuSAagCQbfqqnTQduMRdJPHT3X3sXwl7ABW+qfOoeYC
rjqIbjdh5dB1d/f7igtgBbXjSTnVz/Mr1+wk4rp9Xr1Wv0IXvIaSKjK2Df8ZuNAk
aQ6mMy/oDVxlDSrYv0F7lB40/rsZcPqz8+fgYJ2FnvCpIM7z7NeTWD2kQJ2UaQ/s
VunShloRxF8It6104EVWZDfEA9NvVBcCALSze0NukqiHZRZYGUzxRNQDrncaksC9
Lm+Z16cUWogsZq7VDCgXYQJeakPQfBDnsx7siMvAbOgvtpSClxuwhdC/czJiix7h
BcpA+l5xSviUhHvzHhDt9iJxHjbUmo1xLDvaZSgj2OjAj257JcwaNBCk5BjZTCwe
xZmQu1FjwaGtjLiG6QY0WJRsq1hiFRIb/MaWar/WpfqADFqARoambGFUjOl+P4Mu
DIM5Z0AS04H+pLuP1QOz/yXxOPEP6Ri36to6XrgzfL/XGet5LW2P59xXxhcWC/OL
/3IAcQrsAqh4aGMOstW1
=UJlh
-----END PGP SIGNATURE-----
Merge tag 'multiplatform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC multiplatform conversion patches from Olof Johansson:
"Here are more patches in the progression towards multiplatform, sparse
irq conversions in particular.
Tegra has a handful of cleanups and general groundwork, but is not
quite there yet on full enablement.
Platforms that are enabled through this branch are VT8500 and Zynq.
Note that i.MX was converted in one of the earlier cleanup branches as
well (before we started a separate topic for multiplatform). And both
new platforms for this merge window, sunxi and bcm, were merged with
multiplatform support enabled."
Fix up conflicts mostly as per Olof.
* tag 'multiplatform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (29 commits)
ARM: zynq: Remove all unused mach headers
ARM: zynq: add support for ARCH_MULTIPLATFORM
ARM: zynq: make use of debug_ll_io_init()
ARM: zynq: remove TTC early mapping
ARM: tegra: move debug-macro.S to include/debug
ARM: tegra: don't include iomap.h from debug-macro.S
ARM: tegra: decouple uncompress.h and debug-macro.S
ARM: tegra: simplify DEBUG_LL UART selection options
ARM: tegra: select SPARSE_IRQ
ARM: tegra: enhance timer.c to get IO address from device tree
ARM: tegra: enhance timer.c to get IRQ info from device tree
ARM: timer: fix checkpatch warnings
ARM: tegra: add TWD to device tree
ARM: tegra: define DT bindings for and instantiate RTC
ARM: tegra: define DT bindings for and instantiate timer
clocksource/mtu-nomadik: use apb_pclk
clk: ux500: Register mtu apb_pclocks
ARM: plat-nomadik: convert platforms to SPARSE_IRQ
mfd/db8500-prcmu: use the irq_domain_add_simple()
mfd/ab8500-core: use irq_domain_add_simple()
...
Continued device tree conversion and enablement across a number of
platforms; Kirkwood, tegra, i.MX, Exynos, zynq and a couple of other
smaller series as well.
ux500 has seen continued conversion for platforms. Several platforms have
seen pinctrl-via-devicetree conversions for simpler multiplatform. Tegra
is adding data for new devices/drivers, and Exynos has a bunch of new
bindings and devices added as well.
So, pretty much the same progression in the right direction as the last
few releases.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQySW7AAoJEIwa5zzehBx39xcP/jzEQOTOJdK4zJd1OjgrQoX/
WnhbGJT941RNjRjvDG6HmZzhpsRoE4q/zkjFEKoKELdikRW0hYoR+zPCGuB7XtN5
aF1ZQrTx4gHf4KE7doIB8slaWeOq8aG2TLFhylyy+cuaIpRK0NG0pAR0ZqWaoga9
tZFciqzplLeo50vZ+y+lVVsR40j/w29EjwPXhCV30//gGOYLyp/VDu5PRtrBdgh8
EgpcT2EWJwMCN/Upcao/q2JbQktPHPpSwnpaUAALYB20uD7k5jo7wtYE/+L9nn6B
bxcCDTMVmqzNTF+y0P16hDcs5jMLVjpI0xBiyZ1G6gShpggsSZCHY5ynjAtQ19se
r+2WrNfOR23k6arJuOUAQSEnLdx0T5SlW6CJeFEofKv4uoebxAbKUiNO4ShWskhd
nNptX1+L3hj3zpjGcEHmL6bd+nGtyMeoG9Yekcv1oZxdVcpKhFxh0s5PEJBEeXcN
M7aAWlWJkplV22Olqhpc/3INCweq6E+zBrBxZaUBW/JCzGrqBUGC0BULDPAkmC4J
CKL6IqIB73jGQ4OY14IaMU20GJrIGxZ7wzXOp4aw3OUpRlxsgurfyFQeIjUvVoZL
PJ8DRoAVwreVHvKfgZZVKpSAY7dwcWbxpWsYlrH3zWIC5vRJ0UFwsD0TpLJWd6Vi
XA8gQcJRWKGS8E5mRY39
=Rk9v
-----END PGP SIGNATURE-----
Merge tag 'dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC device tree conversions and enablement from Olof Johansson:
"Continued device tree conversion and enablement across a number of
platforms; Kirkwood, tegra, i.MX, Exynos, zynq and a couple of other
smaller series as well.
ux500 has seen continued conversion for platforms. Several platforms
have seen pinctrl-via-devicetree conversions for simpler
multiplatform. Tegra is adding data for new devices/drivers, and
Exynos has a bunch of new bindings and devices added as well.
So, pretty much the same progression in the right direction as the
last few releases."
Fix up conflicts as per Olof.
* tag 'dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (185 commits)
ARM: ux500: Rename dbx500 cpufreq code to be more generic
ARM: dts: add missing ux500 device trees
ARM: ux500: Stop registering the PCM driver from platform code
ARM: ux500: Move board specific GPIO info out to subordinate DTS files
ARM: ux500: Disable the MMCI gpio-regulator by default
ARM: Kirkwood: remove kirkwood_ehci_init() from new boards
ARM: Kirkwood: Add support LED of OpenBlocks A6
ARM: Kirkwood: Convert to EHCI via DT for OpenBlocks A6
ARM: kirkwood: Add NAND partiton map for OpenBlocks A6
ARM: kirkwood: Add support second I2C bus and RTC on OpenBlocks A6
ARM: kirkwood: Add support DT of second I2C bus
ARM: kirkwood: Convert mplcec4 board to pinctrl
ARM: Kirkwood: Convert km_kirkwood to pinctrl
ARM: Kirkwood: support 98DX412x kirkwoods with pinctrl
ARM: Kirkwood: Convert IX2-200 to pinctrl.
ARM: Kirkwood: Convert lsxl boards to pinctrl.
ARM: Kirkwood: Convert ib62x0 to pinctrl.
ARM: Kirkwood: Convert GoFlex Net to pinctrl.
ARM: Kirkwood: Convert dreamplug to pinctrl.
ARM: Kirkwood: Convert dockstar to pinctrl.
...
This would reset a connection with any OSD that had an outstanding
request that was taking more than N seconds. The idea was that if the
OSD was buggy, the client could compensate by resending the request.
In reality, this only served to hide server bugs, and we haven't
actually seen such a bug in quite a while. Moreover, the userspace
client code never did this.
More importantly, often the request is taking a long time because the
OSD is trying to recover, or overloaded, and killing the connection
and retrying would only make the situation worse by giving the OSD
more work to do.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Add AVX2 optimized gen_syndrom functions, which is simply based on
sse2.c written by hpa.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Jim Kukunas <james.t.kukunas@linux.intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Optimize RAID6 recovery functions to take advantage of
the 256-bit YMM integer instructions introduced in AVX2.
The patch was tested and benchmarked before submission.
However hardware is not yet released so benchmark numbers
cannot be reported.
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Jim Kukunas <james.t.kukunas@linux.intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Pull networking changes from David Miller:
1) Allow to dump, monitor, and change the bridge multicast database
using netlink. From Cong Wang.
2) RFC 5961 TCP blind data injection attack mitigation, from Eric
Dumazet.
3) Networking user namespace support from Eric W. Biederman.
4) tuntap/virtio-net multiqueue support by Jason Wang.
5) Support for checksum offload of encapsulated packets (basically,
tunneled traffic can still be checksummed by HW). From Joseph
Gasparakis.
6) Allow BPF filter access to VLAN tags, from Eric Dumazet and
Daniel Borkmann.
7) Bridge port parameters over netlink and BPDU blocking support
from Stephen Hemminger.
8) Improve data access patterns during inet socket demux by rearranging
socket layout, from Eric Dumazet.
9) TIPC protocol updates and cleanups from Ying Xue, Paul Gortmaker, and
Jon Maloy.
10) Update TCP socket hash sizing to be more in line with current day
realities. The existing heurstics were choosen a decade ago.
From Eric Dumazet.
11) Fix races, queue bloat, and excessive wakeups in ATM and
associated drivers, from Krzysztof Mazur and David Woodhouse.
12) Support DOVE (Distributed Overlay Virtual Ethernet) extensions
in VXLAN driver, from David Stevens.
13) Add "oops_only" mode to netconsole, from Amerigo Wang.
14) Support set and query of VEB/VEPA bridge mode via PF_BRIDGE, also
allow DCB netlink to work on namespaces other than the initial
namespace. From John Fastabend.
15) Support PTP in the Tigon3 driver, from Matt Carlson.
16) tun/vhost zero copy fixes and improvements, plus turn it on
by default, from Michael S. Tsirkin.
17) Support per-association statistics in SCTP, from Michele
Baldessari.
And many, many, driver updates, cleanups, and improvements. Too
numerous to mention individually.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
net/mlx4_en: Add support for destination MAC in steering rules
net/mlx4_en: Use generic etherdevice.h functions.
net: ethtool: Add destination MAC address to flow steering API
bridge: add support of adding and deleting mdb entries
bridge: notify mdb changes via netlink
ndisc: Unexport ndisc_{build,send}_skb().
uapi: add missing netconf.h to export list
pkt_sched: avoid requeues if possible
solos-pci: fix double-free of TX skb in DMA mode
bnx2: Fix accidental reversions.
bna: Driver Version Updated to 3.1.2.1
bna: Firmware update
bna: Add RX State
bna: Rx Page Based Allocation
bna: TX Intr Coalescing Fix
bna: Tx and Rx Optimizations
bna: Code Cleanup and Enhancements
ath9k: check pdata variable before dereferencing it
ath5k: RX timestamp is reported at end of frame
ath9k_htc: RX timestamp is reported at end of frame
...
reserve_bootmem_generic() has no caller,
Signed-off-by: Lin Feng <linfeng@cn.fujitsu.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We have two different implementation of is_zero_pfn() and my_zero_pfn()
helpers: for architectures with and without zero page coloring.
Let's consolidate them in <asm-generic/pgtable.h>.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently a zone's present_pages is calcuated as below, which is
inaccurate and may cause trouble to memory hotplug.
spanned_pages - absent_pages - memmap_pages - dma_reserve.
During fixing bugs caused by inaccurate zone->present_pages, we found
zone->present_pages has been abused. The field zone->present_pages may
have different meanings in different contexts:
1) pages existing in a zone.
2) pages managed by the buddy system.
For more discussions about the issue, please refer to:
http://lkml.org/lkml/2012/11/5/866https://patchwork.kernel.org/patch/1346751/
This patchset tries to introduce a new field named "managed_pages" to
struct zone, which counts "pages managed by the buddy system". And revert
zone->present_pages to count "physical pages existing in a zone", which
also keep in consistence with pgdat->node_present_pages.
We will set an initial value for zone->managed_pages in function
free_area_init_core() and will adjust it later if the initial value is
inaccurate.
For DMA/normal zones, the initial value is set to:
(spanned_pages - absent_pages - memmap_pages - dma_reserve)
Later zone->managed_pages will be adjusted to the accurate value when the
bootmem allocator frees all free pages to the buddy system in function
free_all_bootmem_node() and free_all_bootmem().
The bootmem allocator doesn't touch highmem pages, so highmem zones'
managed_pages is set to the accurate value "spanned_pages - absent_pages"
in function free_area_init_core() and won't be updated anymore.
This patch also adds a new field "managed_pages" to /proc/zoneinfo
and sysrq showmem.
[akpm@linux-foundation.org: small comment tweaks]
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan@kernel.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We need a node which only contains movable memory. This feature is very
important for node hotplug. If a node has normal/highmem, the memory may
be used by the kernel and can't be offlined. If the node only contains
movable memory, we can offline the memory and the node.
All are prepared, we can actually introduce N_MEMORY.
add CONFIG_MOVABLE_NODE make we can use it for movable-dedicated node
[akpm@linux-foundation.org: fix Kconfig text]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Tested-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While profiling numa/core v16 with cgroup_disable=memory on the command
line, I noticed mem_cgroup_count_vm_event() still showed up as high as
0.60% in perftop.
This occurs because the function is called extremely often even when memcg
is disabled.
To fix this, inline the check for mem_cgroup_disabled() so we avoid the
unnecessary function call if memcg is disabled.
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Glauber Costa <glommer@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
N_HIGH_MEMORY stands for the nodes that has normal or high memory.
N_MEMORY stands for the nodes that has any memory.
The code here need to handle with the nodes which have memory, we should
use N_MEMORY instead.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Lin Feng <linfeng@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We have N_NORMAL_MEMORY for standing for the nodes that have normal memory
with zone_type <= ZONE_NORMAL.
And we have N_HIGH_MEMORY for standing for the nodes that have normal or
high memory.
But we don't have any word to stand for the nodes that have *any* memory.
And we have N_CPU but without N_MEMORY.
Current code reuse the N_HIGH_MEMORY for this purpose because any node
which has memory must have high memory or normal memory currently.
A) But this reusing is bad for *readability*. Because the name
N_HIGH_MEMORY just stands for high or normal:
A.example 1)
mem_cgroup_nr_lru_pages():
for_each_node_state(nid, N_HIGH_MEMORY)
The user will be confused(why this function just counts for high or
normal memory node? does it counts for ZONE_MOVABLE's lru pages?)
until someone else tell them N_HIGH_MEMORY is reused to stand for
nodes that have any memory.
A.cont) If we introduce N_MEMORY, we can reduce this confusing
AND make the code more clearly:
A.example 2) mm/page_cgroup.c use N_HIGH_MEMORY twice:
One is in page_cgroup_init(void):
for_each_node_state(nid, N_HIGH_MEMORY) {
It means if the node have memory, we will allocate page_cgroup map for
the node. We should use N_MEMORY instead here to gaim more clearly.
The second using is in alloc_page_cgroup():
if (node_state(nid, N_HIGH_MEMORY))
addr = vzalloc_node(size, nid);
It means if the node has high or normal memory that can be allocated
from kernel. We should keep N_HIGH_MEMORY here, and it will be better
if the "any memory" semantic of N_HIGH_MEMORY is removed.
B) This reusing is out-dated if we introduce MOVABLE-dedicated node.
The MOVABLE-dedicated node should not appear in
node_stats[N_HIGH_MEMORY] nor node_stats[N_NORMAL_MEMORY],
because MOVABLE-dedicated node has no high or normal memory.
In x86_64, N_HIGH_MEMORY=N_NORMAL_MEMORY, if a MOVABLE-dedicated node
is in node_stats[N_HIGH_MEMORY], it is also means it is in
node_stats[N_NORMAL_MEMORY], it causes SLUB wrong.
The slub uses
for_each_node_state(nid, N_NORMAL_MEMORY)
and creates kmem_cache_node for MOVABLE-dedicated node and cause problem.
In one word, we need a N_MEMORY. We just intrude it as an alias to
N_HIGH_MEMORY and fix all im-proper usages of N_HIGH_MEMORY in late
patches.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Lin Feng <linfeng@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
By default kernel tries to use huge zero page on read page fault. It's
possible to disable huge zero page by writing 0 or enable it back by
writing 1:
echo 0 >/sys/kernel/mm/transparent_hugepage/khugepaged/use_zero_page
echo 1 >/sys/kernel/mm/transparent_hugepage/khugepaged/use_zero_page
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
hzp_alloc is incremented every time a huge zero page is successfully
allocated. It includes allocations which where dropped due
race with other allocation. Note, it doesn't count every map
of the huge zero page, only its allocation.
hzp_alloc_failed is incremented if kernel fails to allocate huge zero
page and falls back to using small pages.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pass vma instead of mm and add address parameter.
In most cases we already have vma on the stack. We provides
split_huge_page_pmd_mm() for few cases when we have mm, but not vma.
This change is preparation to huge zero pmd splitting implementation.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On write access to huge zero page we alloc a new huge page and clear it.
If ENOMEM, graceful fallback: we create a new pmd table and set pte around
fault address to newly allocated normal (4k) page. All other ptes in the
pmd set to normal zero page.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull big execve/kernel_thread/fork unification series from Al Viro:
"All architectures are converted to new model. Quite a bit of that
stuff is actually shared with architecture trees; in such cases it's
literally shared branch pulled by both, not a cherry-pick.
A lot of ugliness and black magic is gone (-3KLoC total in this one):
- kernel_thread()/kernel_execve()/sys_execve() redesign.
We don't do syscalls from kernel anymore for either kernel_thread()
or kernel_execve():
kernel_thread() is essentially clone(2) with callback run before we
return to userland, the callbacks either never return or do
successful do_execve() before returning.
kernel_execve() is a wrapper for do_execve() - it doesn't need to
do transition to user mode anymore.
As a result kernel_thread() and kernel_execve() are
arch-independent now - they live in kernel/fork.c and fs/exec.c
resp. sys_execve() is also in fs/exec.c and it's completely
architecture-independent.
- daemonize() is gone, along with its parts in fs/*.c
- struct pt_regs * is no longer passed to do_fork/copy_process/
copy_thread/do_execve/search_binary_handler/->load_binary/do_coredump.
- sys_fork()/sys_vfork()/sys_clone() unified; some architectures
still need wrappers (ones with callee-saved registers not saved in
pt_regs on syscall entry), but the main part of those suckers is in
kernel/fork.c now."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (113 commits)
do_coredump(): get rid of pt_regs argument
print_fatal_signal(): get rid of pt_regs argument
ptrace_signal(): get rid of unused arguments
get rid of ptrace_signal_deliver() arguments
new helper: signal_pt_regs()
unify default ptrace_signal_deliver
flagday: kill pt_regs argument of do_fork()
death to idle_regs()
don't pass regs to copy_process()
flagday: don't pass regs to copy_thread()
bfin: switch to generic vfork, get rid of pointless wrappers
xtensa: switch to generic clone()
openrisc: switch to use of generic fork and clone
unicore32: switch to generic clone(2)
score: switch to generic fork/vfork/clone
c6x: sanitize copy_thread(), get rid of clone(2) wrapper, switch to generic clone()
take sys_fork/sys_vfork/sys_clone prototypes to linux/syscalls.h
mn10300: switch to generic fork/vfork/clone
h8300: switch to generic fork/vfork/clone
tile: switch to generic clone()
...
Conflicts:
arch/microblaze/include/asm/Kbuild
This contains the bulk of new SoC development for this merge window.
Two new platforms have been added, the sunxi platforms (Allwinner A1x
SoCs) by Maxime Ripard, and a generic Broadcom platform for a new
series of ARMv7 platforms from them, where the hope is that we can
keep the platform code generic enough to have them all share one mach
directory. The new Broadcom platform is contributed by Christian Daudt.
Highbank has grown support for Calxeda's next generation of hardware,
ECX-2000.
clps711x has seen a lot of cleanup from Alexander Shiyan, and he's also
taken on maintainership of the platform.
Beyond this there has been a bunch of work from a number of people on
converting more platforms to IRQ domains, pinctrl conversion, cleanup
and general feature enablement across most of the active platforms.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQyLCjAAoJEIwa5zzehBx3AdQP/R+L3+EQMjiEWt/p7g/ql5Em
0SnP92CcGzrjgLTg9z1FeOazfOsGnkZAYUlDRkqfKobH3VqkhYFFtt1/0x0KMahm
xcowHgMBOyimFdWT9vLK3J8U6DLui5XrEG9LGH2VL+lqmfjIyP/OOF3mVc0/+pV9
WTLAsYswdBRSeiNuF43kqlfrOwF6xsPLgiNMlc82w6BzHqoHu6dOif5M9MqWaApS
V74DPmwLD371Tyit6aHqt3JOqpgiPSHlmxkzomK+5idcW3Pa7HnzzFYmx85dk/eN
J2siqIkoOu7tEfjIbNZTL2MYoX4tUUKv4qZZ3IOl3YSWaV3P5ilMApF01XVrkk8E
DWOMhzte9hC7L90W+/kCPLF1VyeAhCem2KQWUitO71fKur3r+3ZaUokNVvWzkJIL
7aduxAJOV2hfLgEqbjbjF3o4S8p63OV3kzivFJM1And15zDJo4+qqOh67+bPo4jj
+R4du+SqzXriw4i3tDLGVpdjDffk4D41tbLzgkWAtvGyoP45yeYfHAzAh0pDFPRv
ASfZVmZ5PhwAUAkIMnpC2sjgmxMYff3SYqmDgnsqXES7rbDH/hG+teymtHFTyUQp
m+f60DNotSMcMvkLdvruLSB4aeTiwbfOqPn/g+aXYUlPuNMq1fVWgN7EJKWkamK4
nRwaJmLwx1/ojcVbpy2G
=YMKB
-----END PGP SIGNATURE-----
Merge tag 'soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC updates from Olof Johansson:
"This contains the bulk of new SoC development for this merge window.
Two new platforms have been added, the sunxi platforms (Allwinner A1x
SoCs) by Maxime Ripard, and a generic Broadcom platform for a new
series of ARMv7 platforms from them, where the hope is that we can
keep the platform code generic enough to have them all share one mach
directory. The new Broadcom platform is contributed by Christian
Daudt.
Highbank has grown support for Calxeda's next generation of hardware,
ECX-2000.
clps711x has seen a lot of cleanup from Alexander Shiyan, and he's
also taken on maintainership of the platform.
Beyond this there has been a bunch of work from a number of people on
converting more platforms to IRQ domains, pinctrl conversion, cleanup
and general feature enablement across most of the active platforms."
Fix up trivial conflicts as per Olof.
* tag 'soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (174 commits)
mfd: vexpress-sysreg: Remove LEDs code
irqchip: irq-sunxi: Add terminating entry for sunxi_irq_dt_ids
clocksource: sunxi_timer: Add terminating entry for sunxi_timer_dt_ids
irq: versatile: delete dangling variable
ARM: sunxi: add missing include for mdelay()
ARM: EXYNOS: Avoid early use of of_machine_is_compatible()
ARM: dts: add node for PL330 MDMA1 controller for exynos4
ARM: EXYNOS: Add support for secondary CPU bring-up on Exynos4412
ARM: EXYNOS: add UART3 to DEBUG_LL ports
ARM: S3C24XX: Add clkdev entry for camif-upll clock
ARM: SAMSUNG: Add s3c24xx/s3c64xx CAMIF GPIO setup helpers
ARM: sunxi: Add missing sun4i.dtsi file
pinctrl: samsung: Do not initialise statics to 0
ARM i.MX6: remove gate_mask from pllv3
ARM i.MX6: Fix ethernet PLL clocks
ARM i.MX6: rename PLLs according to datasheet
ARM i.MX6: Add pwm support
ARM i.MX51: Add pwm support
ARM i.MX53: Add pwm support
ARM: mx5: Replace clk_register_clkdev with clock DT lookup
...
Cleanup patches for various ARM platforms and some of their associated
drivers. There's also a branch in here that enables Freescale i.MX to be
part of the multiplatform support -- the first "big" SoC that is moved
over (more multiplatform work comes in a separate branch later during
the merge window).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQx2p9AAoJEIwa5zzehBx3aPUQAIjV3VDf/ACkA4KUQu0BFg5U
57OIkl6RCZvfKhYgq5+6OJ2AK6VkGh9PqTmXkDS7Nj3QMS/uWcb3U419aPJsd3Z/
vNGpTl+J/YcAcFrKMqTyNv98TAiAOJlpm70CqmRbkhpMfoJb7//1JKqGTJPBO+tj
8ZEwNGC0WbRNOSQTY/TTAhbZE1sqXwKy9mDLGmcwqKBY8H1TFHyPB6yWYFSxMHxS
JAegbYhYO9FawOOLoi9ovT+2vUR9vDu0xxV4zUK9f5DqKcCb/wYuN0QkusjnEutm
RfIt7iXHHzi35YPxtlrGgSz9EIYXKAafSzkgf3Ydpjci5DH/vbVexm/CT+V+SwOT
SvucYJMALI/aOEFJWN/50L6B9zipSrWb51tK7WFXz/sUCrMQrXH3Mu99mjHZXSoL
1cylsvs3DFQC7vHFLSjRpX6eJdfE+Hb0LZ878eXSbDVCOnU8odAQrofugqfmeVDk
eN0+BWmchJgvljOiKVUQMC3PCquCaAAO1lm/HU7bWPlVigTuHSW0uisDyCYAtlt1
dGxnbbhoFJvSH7CMOoMO7hIFnoNJEe6+uVUuwA/+iJouMXMJLoY6Da4L72h1Mp81
o4Hr6Kxly/SMtURZ/6pCycx5ahb5TaahstYAoe7Qp1dMj5U2m6fUVfKkG7tUx1CW
MIuvN3qJeW2qKWKmZRVM
=zfPd
-----END PGP SIGNATURE-----
Merge tag 'cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC cleanups on various subarchitectures from Olof Johansson:
"Cleanup patches for various ARM platforms and some of their associated
drivers. There's also a branch in here that enables Freescale i.MX to
be part of the multiplatform support -- the first "big" SoC that is
moved over (more multiplatform work comes in a separate branch later
during the merge window)."
Conflicts fixed as per Olof, including a silent semantic one in
arch/arm/mach-omap2/board-generic.c (omap_prcm_restart() was renamed to
omap3xxx_restart(), and a new user of the old name was added).
* tag 'cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (189 commits)
ARM: omap: fix typo on timer cleanup
ARM: EXYNOS: Remove unused regs-mem.h file
ARM: EXYNOS: Remove unused non-dt support for dwmci controller
ARM: Kirkwood: Use hw_pci.ops instead of hw_pci.scan
ARM: OMAP3: cm-t3517: use GPTIMER for system clock
ARM: OMAP2+: timer: remove CONFIG_OMAP_32K_TIMER
ARM: SAMSUNG: use devm_ functions for ADC driver
ARM: EXYNOS: no duplicate mask/unmask in eint0_15
ARM: S3C24XX: SPI clock channel setup is fixed for S3C2443
ARM: EXYNOS: Remove i2c0 resource information and setting of device names
ARM: Kirkwood: checkpatch cleanups
ARM: Kirkwood: Fix sparse warnings.
ARM: Kirkwood: Remove unused includes
ARM: kirkwood: cleanup lsxl board includes
ARM: integrator: use BUG_ON where possible
ARM: integrator: push down SC dependencies
ARM: integrator: delete static UART1 mapping
ARM: integrator: delete SC mapping on the CP
ARM: integrator: remove static CP syscon mapping
ARM: integrator: remove static AP syscon mapping
...
This is a collection of header file cleanups, mostly for OMAP and AT91,
that keeps moving the platforms in the direction of multiplatform by
removing the need for mach-dependent header files used in drivers and
other places.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQx2reAAoJEIwa5zzehBx3Z/oQAIHp8aXbUD0GvbBPQ2vIydZ8
Qfdh2ypbyFB9GSLmPP6BhurEFUcwmW3MH2r9Kq68omPt4XxrFEt1SdEn3TgZlhTY
YYzEv2DLbWfmKCDBFAEFullsfVkrpggyou5ty30YVp/u2lVLvChpkciE5/9HADy5
uoh/W2KZcLE51tvnbRy/1DBaOtxEdCjJ9hTaef7FCB9ugOG+yMDYwIPRW/b4fNA6
o/1NuZTWp0H4ePRG2oqLxYnjSiF63DVTJZjSJ+c5gW34bWgh8+/xzaLFA9U++/ig
meGUD1Oe65+ctr+Wk2mV29eb02jauUe6qvkwt+iFIDDopgc2mn5BcW+ENpgpjhe3
6l3ClRd94qh4cMWzqDa5PZCdetshiKqSH2IGpKS/dHNB1h5sBP7CCbvbalwzWd5n
qjfkPC7kSFyU3XV+9SEAXE6HLKsiMQy9kRcKOMdTE5BA4FEAZk0wfiG9WcVCvS2D
9tDC3X0aScyXx4Mkd4wIAZVhY68QuI17fPft9zZSj691YiUW5cR2yF1qbCka0yd5
pHu/QSZg1+dUitMUvMPQmWJ15jnAtEYUtV/8zQFFVchgkxlrW+UtvW3zxghB6D+J
ZwcjAMfQQz9fLoMQXlVovjQIhYsjw3SlV62ReBH7eyPs3+KfPIBn7Ks6mVQs3dS5
AMUWORsB5u92XHl9EL9C
=aggA
-----END PGP SIGNATURE-----
Merge tag 'headers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC Header cleanups from Olof Johansson:
"This is a collection of header file cleanups, mostly for OMAP and
AT91, that keeps moving the platforms in the direction of
multiplatform by removing the need for mach-dependent header files
used in drivers and other places."
Fix up mostly trivial conflicts as per Olof.
* tag 'headers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (106 commits)
ARM: OMAP2+: Move iommu/iovmm headers to platform_data
ARM: OMAP2+: Make some definitions local
ARM: OMAP2+: Move iommu2 to drivers/iommu/omap-iommu2.c
ARM: OMAP2+: Move plat/iovmm.h to include/linux/omap-iommu.h
ARM: OMAP2+: Move iopgtable header to drivers/iommu/
ARM: OMAP: Merge iommu2.h into iommu.h
atmel: move ATMEL_MAX_UART to platform_data/atmel.h
ARM: OMAP: Remove omap_init_consistent_dma_size()
arm: at91: move at91rm9200 rtc header in drivers/rtc
arm: at91: move reset controller header to arm/arm/mach-at91
arm: at91: move pit define to the driver
arm: at91: move at91_shdwc.h to arch/arm/mach-at91
arm: at91: move board header to arch/arm/mach-at91
arn: at91: move at91_tc.h to arch/arm/mach-at91
arm: at91 move at91_aic.h to arch/arm/mach-at91
arm: at91 move board.h to arch/arm/mach-at91
arm: at91: move platfarm_data to include/linux/platform_data/atmel.h
arm: at91: drop machine defconfig
ARM: OMAP: Remove NEED_MACH_GPIO_H
ARM: OMAP: Remove unnecessary mach and plat includes
...
Pull ARM updates from Russell King:
"Here's the updates for ARM for this merge window, which cover quite a
variety of areas.
There's a bunch of patch series from Will tackling various bugs like
the PROT_NONE handling, ASID allocation, cluster boot protocol and
ASID TLB tagging updates.
We move to a build-time sorted exception table rather than doing the
sorting at run-time, add support for the secure computing filter, and
some updates to the perf code. We also have sorted out the placement
of some headers, fixed some build warnings, fixed some hotplug
problems with the per-cpu TWD code."
* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (73 commits)
ARM: 7594/1: Add .smp entry for REALVIEW_EB
ARM: 7599/1: head: Remove boot-time HYP mode check for v5 and below
ARM: 7598/1: net: bpf_jit_32: fix sp-relative load/stores offsets.
ARM: 7595/1: syscall: rework ordering in syscall_trace_exit
ARM: 7596/1: mmci: replace readsl/writesl with ioread32_rep/iowrite32_rep
ARM: 7597/1: net: bpf_jit_32: fix kzalloc gfp/size mismatch.
ARM: 7593/1: nommu: do not enable DCACHE_WORD_ACCESS when !CONFIG_MMU
ARM: 7592/1: nommu: prevent generation of kernel unaligned memory accesses
ARM: 7591/1: nommu: Enable the strict alignment (CR_A) bit only if ARCH < v6
ARM: 7590/1: /proc/interrupts: limit the display of IPIs to online CPUs only
ARM: 7587/1: implement optimized percpu variable access
ARM: 7589/1: integrator: pass the lm resource to amba
ARM: 7588/1: amba: create a resource parent registrator
ARM: 7582/2: rename kvm_seq to vmalloc_seq so to avoid confusion with KVM
ARM: 7585/1: kernel: fix nr_cpu_ids check in DT logical map init
ARM: 7584/1: perf: fix link error when CONFIG_HW_PERF_EVENTS is not selected
ARM: gic: use a private mapping for CPU target interfaces
ARM: kernel: add logical mappings look-up
ARM: kernel: add cpu logical map DT init in setup_arch
ARM: kernel: add device tree init map function
...
Pull cgroup changes from Tejun Heo:
"A lot of activities on cgroup side. The big changes are focused on
making cgroup hierarchy handling saner.
- cgroup_rmdir() had peculiar semantics - it allowed cgroup
destruction to be vetoed by individual controllers and tried to
drain refcnt synchronously. The vetoing never worked properly and
caused good deal of contortions in cgroup. memcg was the last
reamining user. Michal Hocko removed the usage and cgroup_rmdir()
path has been simplified significantly. This was done in a
separate branch so that the memcg people can base further memcg
changes on top.
- The above allowed cleaning up cgroup lifecycle management and
implementation of generic cgroup iterators which are used to
improve hierarchy support.
- cgroup_freezer updated to allow migration in and out of a frozen
cgroup and handle hierarchy. If a cgroup is frozen, all descendant
cgroups are frozen.
- netcls_cgroup and netprio_cgroup updated to handle hierarchy
properly.
- Various fixes and cleanups.
- Two merge commits. One to pull in memcg and rmdir cleanups (needed
to build iterators). The other pulled in cgroup/for-3.7-fixes for
device_cgroup fixes so that further device_cgroup patches can be
stacked on top."
Fixed up a trivial conflict in mm/memcontrol.c as per Tejun (due to
commit bea8c150a7 ("memcg: fix hotplugged memory zone oops") in master
touching code close to commit 2ef37d3fe4 ("memcg: Simplify
mem_cgroup_force_empty_list error handling") in for-3.8)
* 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (65 commits)
cgroup: update Documentation/cgroups/00-INDEX
cgroup_rm_file: don't delete the uncreated files
cgroup: remove subsystem files when remounting cgroup
cgroup: use cgroup_addrm_files() in cgroup_clear_directory()
cgroup: warn about broken hierarchies only after css_online
cgroup: list_del_init() on removed events
cgroup: fix lockdep warning for event_control
cgroup: move list add after list head initilization
netprio_cgroup: allow nesting and inherit config on cgroup creation
netprio_cgroup: implement netprio[_set]_prio() helpers
netprio_cgroup: use cgroup->id instead of cgroup_netprio_state->prioidx
netprio_cgroup: reimplement priomap expansion
netprio_cgroup: shorten variable names in extend_netdev_table()
netprio_cgroup: simplify write_priomap()
netcls_cgroup: move config inheritance to ->css_online() and remove .broken_hierarchy marking
cgroup: remove obsolete guarantee from cgroup_task_migrate.
cgroup: add cgroup->id
cgroup, cpuset: remove cgroup_subsys->post_clone()
cgroup: s/CGRP_CLONE_CHILDREN/CGRP_CPUSET_CLONE_CHILDREN/
cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free()
...
Pull thermal management update from Zhang Rui:
"Highlights:
- Introduction of thermal policy support, together with three new
thermal governors, including step_wise, user_space, fire_share.
- Introduction of ST-Ericsson db8500_thermal driver and ST-Ericsson
db8500_cpufreq_cooling driver.
- Thermal Kconfig file and Makefile refactor.
- Fixes for generic thermal layer, generic cpucooling, rcar thermal
driver and Exynos thermal driver."
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (36 commits)
Thermal: Fix DEFAULT_THERMAL_GOVERNOR
Thermal: fix a NULL pointer dereference when generic thermal layer is built as a module
thermal: rcar: add rcar_zone_to_priv() macro
thermal: rcar: fixup the unit of temperature
thermal: cpu cooling: allow module builds
thermal: cpu cooling: use const parameter while registering
Thermal: Add ST-Ericsson DB8500 thermal properties and platform data.
Thermal: Add ST-Ericsson DB8500 thermal driver.
drivers/thermal/Makefile refactor
Exynos: Add missing dependency
Refactor drivers/thermal/Kconfig
thermal: cpu_cooling: Make 'notify_device' static
Thermal: Remove the cooling_cpufreq_list.
Thermal: fix bug of counting cpu frequencies.
Thermal: add indent for code alignment.
thermal: rcar_thermal: remove explicitly used devm_kfree/iounap()
thermal: user_space: Add missing static storage class specifiers
thermal: fair_share: Add missing static storage class specifiers
thermal: step_wise: Add missing static storage class specifiers
Thermal: Fix oops and unlocking in thermal_sys.c
...
Quite a few enhancements this time around, helpers and diagnostics for
the most part which is good to see:
- Addition of table based lookups for the register access checks from
Davide Ciminaghi, making life easier for drivers with big blocks of
similar registers.
- Allow drivers to get the irqdomain for regmap irq_chips, allowing the
domain to be used with other APIs.
- Debug improvements for paged register maps.
- Performance improvments for some of the diagnostic infrastructure,
very helpful for devices with large register maps.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJQxy2mAAoJELSic+t+oim9s+QP/0sXqx5MtY5C6xMhY64P3OOB
A5zCHHl12eXFc9D8MnGzhS87AfVFsqMe7I/Opdq4uajqUXUe4qQcYYAo0p2lFWWo
nD1R4K9TU2X8d4eF3S38anH8ADBR38tRwYeSzsbZqsLYuolbs92/TJJZKqsQjxdo
TipVEEqWXHntc43fOhrvVqfmD+2n/9vVKBBC5vtx6IW2o1JSfcHoKe7+qnOeRUsc
f+UWOhPM6NrNQnICzBrRQ6xTFT55TpfSyas7tHiMVaoWGe+p5aqsXbudzOjFlfny
vEnRv2h5CJuqZdrv7g/vxV7i2KayOJq9rFS/1jQ4xcIOCIlNN14RBONKR7VuicGv
I6GjXQPRjgIPnVxMhzujn25zZ2bFDpx1AZLrp/I1PZi3582/ZfF9twFxJ4T8OFOH
uZESk/jMxaOZzEN0h/TNgcrmJnbpGh1SCBUihLDJi/eSglGtPkd7GtqutL/kHm01
boWWv3nwAztNrh2HDQkTUi4+bqNtxh3ZTOPFHBu31SLcATUlEDCQ8zuwhfCPp60U
jP9VYx1tMtK3sLg7wUKLZTCYQF9ezaquBlAOcMINbRisN2dByj/ZAfCMSTZ3ZJqQ
iVoThJR3llKfWzS7X9PdUtNPv94Nb+YAfH7HH3wUYGJzgJHMgGek1x/S6D4VAMTu
Wnad4By9rpFEnzT0x32i
=Jy7K
-----END PGP SIGNATURE-----
Merge tag 'regmap-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"Quite a few enhancements this time around, helpers and diagnostics for
the most part which is good to see:
- Addition of table based lookups for the register access checks from
Davide Ciminaghi, making life easier for drivers with big blocks of
similar registers.
- Allow drivers to get the irqdomain for regmap irq_chips, allowing
the domain to be used with other APIs.
- Debug improvements for paged register maps.
- Performance improvments for some of the diagnostic infrastructure,
very helpful for devices with large register maps."
* tag 'regmap-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: debugfs: Cache offsets of valid regions for dump
regmap: debugfs: Factor out initial seek
regmap: debugfs: Avoid overflows for very small reads
regmap: Cache register and value sizes for debugfs
regmap: introduce tables for readable/writeable/volatile/precious checks
regmap: core: Report registers in hex when we can't cache
regmap: Fix printing of size_t variable
regmap: make lock/unlock functions customizable
regmap: silence GCC warning
regmap: Split raw writes that cross window boundaries
regmap: Make return code checks consistent
regmap: Factor range lookup out of page selection
regmap: Provide debugfs read of register ranges
regmap: Factor out debugfs register read
regmap: Allow ranges to be named
regmap: When we sanity check during range adds say what errors we find
regmap: Rename n_ranges to num_ranges
regmap: irq: Allow users to retrieve the irq_domain
to hold multiple records.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQxjFJAAoJEKurIx+X31iBNssQAKCjiXe7dT8DigoKSPj3w1sJ
8xcuJKmlZ5ykgElfQ1Lcfuz0KuMT8SZfBHrRCL389Yvtxq71NL9pMX1OyCqyfVyO
uuKWD/+jVc5d11F15hC2l8zIU7v2S3TXQ64rZEgDDFXIPlTbEcEBy92arpq6Avj3
6ioR+rf4jvOeC66keIcDOIl6r4tVn2f16B9Z6ffSOit0dVeRAOc7bB9LLaohETSY
rCSKHgZLHBZRQv2yGKS3LB6KSL2DVm+lgNhzAVw4kEJNTJl04clK854L8EM7WfGj
qREsvSUB4H4I2XMBW+HCU/6+4c4alSEr/2c8+xBeynL4+I2y+VLwySkwE+8DPgEV
BjCuyjSmpZVuVz6R28hcJIbkK0sN1gntXitb2V157+89f3gVEMYwKaY787TuNl9g
oWYjjeOk66r5xshdVvCnFIgZxgFItznn+kgRkNq0clGH+nWVsZfuYw/q2PLm9xoM
TJR+ZZwMJC25TE99rh9CqVc0kuQi5SXfqjpd+fW8oJ2kSxj4/7SqmOpmK7oWLFNi
kf0xS1yVsQB5mRx3nvERvA36lzNSiQRFXnMjh84be4NX8dYqiGoJQjOh/laemjxd
8hpoMVuUMtXLC4roejyyLwddtTLshWm2ByULAHJRJpuUKzichkaqhuvvkMUjKfDj
NunMg/OfNs78sTlNSATf
=NAdA
-----END PGP SIGNATURE-----
Merge tag 'please-pull-pstore_mevent' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux
Pull pstore fixes from Tony Luck:
"Patch series to allow EFI variable backend to pstore to hold multiple
records."
* tag 'please-pull-pstore_mevent' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
efi_pstore: Add a format check for an existing variable name at erasing time
efi_pstore: Add a format check for an existing variable name at reading time
efi_pstore: Add a sequence counter to a variable name
efi_pstore: Add ctime to argument of erase callback
efi_pstore: Remove a logic erasing entries from a write callback to hold multiple logs
efi_pstore: Add a logic erasing entries to an erase callback
efi_pstore: Check remaining space with QueryVariableInfo() before writing data
It should not be necessary to add IDs for HID sensor hubs to lists in
hid-core.c and hid-sensor-hub.c. So instead of a whitelist, autodetect such USB
HID sensor hubs, based on a collection of type physical inside a useage page of
type sensor. If some sensor hubs stil must be usable as raw devices, a
blacklist might be created.
Signed-off-by: Alexander Holler <holler@ahsoftware.de>
Acked-by: "Pandruvada, Srinivas" <srinivas.pandruvada@intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Pull ARM OMAP serial updates from Russell King:
"This series is a major reworking of the OMAP serial driver code fixing
various bugs in the hardware-assisted flow control, extending up into
serial_core for a couple of issues. These fixes have been done as a
set of progressive changes and transformations in the hope that no new
bugs will be introduced by this series.
The problems are many-fold, from the driver not being informed about
updated settings, to the driver not knowing what the intentions of the
upper layers are.
The first four patches tackle the serial_core layer, allowing it to
provide the necessary information to drivers, and the remaining
patches allow the OMAP serial driver to take advantage of this.
This brings hardware assisted RTS/CTS and XON/OFF flow control into a
useful state.
These patches have been in linux-next for most of the last cycle;
indeed they predate the previous merge window. They've also been
posted to the OMAP people."
* 'omap-serial' of git://git.linaro.org/people/rmk/linux-arm: (21 commits)
SERIAL: omap: fix hardware assisted flow control
SERIAL: omap: simplify (2)
SERIAL: omap: move xon/xoff setting earlier
SERIAL: omap: always set TCR
SERIAL: omap: simplify
SERIAL: omap: don't read back LCR/MCR/EFR
SERIAL: omap: serial_omap_configure_xonxoff() contents into set_termios
SERIAL: omap: configure xon/xoff before setting modem control lines
SERIAL: omap: remove OMAP_UART_SYSC_RESET and OMAP_UART_FIFO_CLR
SERIAL: omap: move driver private definitions and structures to driver
SERIAL: omap: remove 'irq_pending' bitfield
SERIAL: omap: fix MCR TCRTLR bit handling
SERIAL: omap: fix set_mctrl() breakage
SERIAL: omap: no need to re-read EFR
SERIAL: omap: remove setting of EFR SCD bit
SERIAL: omap: allow hardware assisted IXANY mode to be disabled
SERIAL: omap: allow hardware assisted rts/cts modes to be disabled
SERIAL: core: add throttle/unthrottle callbacks for hardware assisted flow control
SERIAL: core: add hardware assisted h/w flow control support
SERIAL: core: add hardware assisted s/w flow control support
...
Conflicts:
drivers/tty/serial/omap-serial.c
The merge is merely to fix conflicts before sending a pull request.
Conflicts:
drivers/power/ab8500_btemp.c
drivers/power/ab8500_charger.c
drivers/power/ab8500_fg.c
drivers/power/abx500_chargalg.c
drivers/power/max8925_power.c
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Pull core timer changes from Ingo Molnar:
"It contains continued generic-NOHZ work by Frederic and smaller
cleanups."
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Kill xtime_lock, replacing it with jiffies_lock
clocksource: arm_generic: use this_cpu_ptr per-cpu helper
clocksource: arm_generic: use integer math helpers
time/jiffies: Make clocksource_jiffies static
clocksource: clean up parse_pmtmr()
tick: Correct the comments for tick_sched_timer()
tick: Conditionally build nohz specific code in tick handler
tick: Consolidate tick handling for high and low res handlers
tick: Consolidate timekeeping handling code
Pull scheduler updates from Ingo Molnar:
"The biggest change affects group scheduling: we now track the runnable
average on a per-task entity basis, allowing a smoother, exponential
decay average based load/weight estimation instead of the previous
binary on-the-runqueue/off-the-runqueue load weight method.
This will inevitably disturb workloads that were in some sort of
borderline balancing state or unstable equilibrium, so an eye has to
be kept on regressions.
For that reason the new load average is only limited to group
scheduling (shares distribution) at the moment (which was also hurting
the most from the prior, crude weight calculation and whose scheduling
quality wins most from this change) - but we plan to extend this to
regular SMP balancing as well in the future, which will simplify and
speed up things a bit.
Other changes involve ongoing preparatory work to extend NOHZ to the
scheduler as well, eventually allowing completely irq-free user-space
execution."
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
Revert "sched/autogroup: Fix crash on reboot when autogroup is disabled"
cputime: Comment cputime's adjusting code
cputime: Consolidate cputime adjustment code
cputime: Rename thread_group_times to thread_group_cputime_adjusted
cputime: Move thread_group_cputime() to sched code
vtime: Warn if irqs aren't disabled on system time accounting APIs
vtime: No need to disable irqs on vtime_account()
vtime: Consolidate a bit the ctx switch code
vtime: Explicitly account pending user time on process tick
vtime: Remove the underscore prefix invasion
sched/autogroup: Fix crash on reboot when autogroup is disabled
cputime: Separate irqtime accounting from generic vtime
cputime: Specialize irq vtime hooks
kvm: Directly account vtime to system on guest switch
vtime: Make vtime_account_system() irqsafe
vtime: Gather vtime declarations to their own header file
sched: Describe CFS load-balancer
sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking
sched: Make __update_entity_runnable_avg() fast
sched: Update_cfs_shares at period edge
...
Pull perf updates from Ingo Molnar:
"Lots of activity:
211 files changed, 8328 insertions(+), 4116 deletions(-)
most of it on the tooling side.
Main changes:
* ftrace enhancements and fixes from Steve Rostedt.
* uprobes fixes, cleanups and preparation for the ARM port from Oleg
Nesterov.
* UAPI fixes, from David Howels - prepares the arch/x86 UAPI
transition
* Separate perf tests into multiple objects, one per test, from Jiri
Olsa.
* Make hardware event translations available in sysfs, from Jiri
Olsa.
* Fixes to /proc/pid/maps parsing, preparatory to supporting data
maps, from Namhyung Kim
* Implement ui_progress for GTK, from Namhyung Kim
* Add framework for automated perf_event_attr tests, where tools with
different command line options will be run from a 'perf test', via
python glue, and the perf syscall will be intercepted to verify
that the perf_event_attr fields set by the tool are those expected,
from Jiri Olsa
* Add a 'link' method for hists, so that we can have the leader with
buckets for all the entries in all the hists. This new method is
now used in the default 'diff' output, making the sum of the
'baseline' column be 100%, eliminating blind spots.
* libtraceevent fixes for compiler warnings trying to make perf it
build on some distros, like fedora 14, 32-bit, some of the warnings
really pointed to real bugs.
* Add a browser for 'perf script' and make it available from the
report and annotate browsers. It does filtering to find the
scripts that handle events found in the perf.data file used. From
Feng Tang
* perf inject changes to allow showing where a task sleeps, from
Andrew Vagin.
* Makefile improvements from Namhyung Kim.
* Add --pre and --post command hooks in 'stat', from Peter Zijlstra.
* Don't stop synthesizing threads when one vanishes, this is for the
existing threads when we start a tool like trace.
* Use sched:sched_stat_runtime to provide a thread summary, this
produces the same output as the 'trace summary' subcommand of
tglx's original "trace" tool.
* Support interrupted syscalls in 'trace'
* Add an event duration column and filter in 'trace'.
* There are references to the man pages in some tools, so try to
build Documentation when installing, warning the user if that is
not possible, from Borislav Petkov.
* Give user better message if precise is not supported, from David
Ahern.
* Try to find cross-built objdump path by using the session
environment information in the perf.data file header, from Irina
Tirdea, original patch and idea by Namhyung Kim.
* Diplays more output on features check for make V=1, so that one can
figure out what is happening by looking at gcc output, etc. From
Jiri Olsa.
* Add on_exit implementation for systems without one, e.g. Android,
from Bernhard Rosenkraenzer.
* Only process events for vcpus of interest, helps handling large
number of events, from David Ahern.
* Cross compilation fixes for Android, from Irina Tirdea.
* Add documentation on compiling for Android, from Irina Tirdea.
* perf diff improvements from Jiri Olsa.
* Target (task/user/cpu/syswide) handling improvements, from Namhyung
Kim.
* Add support in 'trace' for tracing workload given by command line,
from Namhyung Kim.
* ... and much more."
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (194 commits)
uprobes: Use percpu_rw_semaphore to fix register/unregister vs dup_mmap() race
perf evsel: Introduce is_group_member method
perf powerpc: Use uapi/unistd.h to fix build error
tools: Pass the target in descend
tools: Honour the O= flag when tool build called from a higher Makefile
tools: Define a Makefile function to do subdir processing
perf ui: Always compile browser setup code
perf ui: Add ui_progress__finish()
perf ui gtk: Implement ui_progress functions
perf ui: Introduce generic ui_progress helper
perf ui tui: Move progress.c under ui/tui directory
perf tools: Add basic event modifier sanity check
perf tools: Omit group members from perf_evlist__disable/enable
perf tools: Ensure single disable call per event in record comand
perf tools: Fix 'disabled' attribute config for record command
perf tools: Fix attributes for '{}' defined event groups
perf tools: Use sscanf for parsing /proc/pid/maps
perf tools: Add gtk.<command> config option for launching GTK browser
perf tools: Fix compile error on NO_NEWT=1 build
perf hists: Initialize all of he->stat with zeroes
...
Pull irq fixes from Ingo Molnar:
"Affinity fixes and a nested threaded IRQ handling fix."
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Always force thread affinity
irq: Set CPU affinity right on thread creation
genirq: Provide means to retrigger parent
Pull RCU update from Ingo Molnar:
"The major features of this tree are:
1. A first version of no-callbacks CPUs. This version prohibits
offlining CPU 0, but only when enabled via CONFIG_RCU_NOCB_CPU=y.
Relaxing this constraint is in progress, but not yet ready
for prime time. These commits were posted to LKML at
https://lkml.org/lkml/2012/10/30/724.
2. Changes to SRCU that allows statically initialized srcu_struct
structures. These commits were posted to LKML at
https://lkml.org/lkml/2012/10/30/296.
3. Restructuring of RCU's debugfs output. These commits were posted
to LKML at https://lkml.org/lkml/2012/10/30/341.
4. Additional CPU-hotplug/RCU improvements, posted to LKML at
https://lkml.org/lkml/2012/10/30/327.
Note that the commit eliminating __stop_machine() was judged to
be too-high of risk, so is deferred to 3.9.
5. Changes to RCU's idle interface, most notably a new module
parameter that redirects normal grace-period operations to
their expedited equivalents. These were posted to LKML at
https://lkml.org/lkml/2012/10/30/739.
6. Additional diagnostics for RCU's CPU stall warning facility,
posted to LKML at https://lkml.org/lkml/2012/10/30/315.
The most notable change reduces the
default RCU CPU stall-warning time from 60 seconds to 21 seconds,
so that it once again happens sooner than the softlockup timeout.
7. Documentation updates, which were posted to LKML at
https://lkml.org/lkml/2012/10/30/280.
A couple of late-breaking changes were posted at
https://lkml.org/lkml/2012/11/16/634 and
https://lkml.org/lkml/2012/11/16/547.
8. Miscellaneous fixes, which were posted to LKML at
https://lkml.org/lkml/2012/10/30/309.
9. Finally, a fix for an lockdep-RCU splat was posted to LKML
at https://lkml.org/lkml/2012/11/7/486."
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (49 commits)
context_tracking: New context tracking susbsystem
sched: Mark RCU reader in sched_show_task()
rcu: Separate accounting of callbacks from callback-free CPUs
rcu: Add callback-free CPUs
rcu: Add documentation for the new rcuexp debugfs trace file
rcu: Update documentation for TREE_RCU debugfs tracing
rcu: Reduce default RCU CPU stall warning timeout
rcu: Fix TINY_RCU rcu_is_cpu_rrupt_from_idle check
rcu: Clarify memory-ordering properties of grace-period primitives
rcu: Add new rcutorture module parameters to start/end test messages
rcu: Remove list_for_each_continue_rcu()
rcu: Fix batch-limit size problem
rcu: Add tracing for synchronize_sched_expedited()
rcu: Remove old debugfs interfaces and also RCU flavor name
rcu: split 'rcuhier' to each flavor
rcu: split 'rcugp' to each flavor
rcu: split 'rcuboost' to each flavor
rcu: split 'rcubarrier' to each flavor
rcu: Fix tracing formatting
rcu: Remove the interface "rcudata.csv"
...
Merge misc updates from Andrew Morton:
"About half of most of MM. Going very early this time due to
uncertainty over the coreautounifiednumasched things. I'll send the
other half of most of MM tomorrow. The rest of MM awaits a slab merge
from Pekka."
* emailed patches from Andrew Morton: (71 commits)
memory_hotplug: ensure every online node has NORMAL memory
memory_hotplug: handle empty zone when online_movable/online_kernel
mm, memory-hotplug: dynamic configure movable memory and portion memory
drivers/base/node.c: cleanup node_state_attr[]
bootmem: fix wrong call parameter for free_bootmem()
avr32, kconfig: remove HAVE_ARCH_BOOTMEM
mm: cma: remove watermark hacks
mm: cma: skip watermarks check for already isolated blocks in split_free_page()
mm, oom: fix race when specifying a thread as the oom origin
mm, oom: change type of oom_score_adj to short
mm: cleanup register_node()
mm, mempolicy: remove duplicate code
mm/vmscan.c: try_to_freeze() returns boolean
mm: introduce putback_movable_pages()
virtio_balloon: introduce migration primitives to balloon pages
mm: introduce compaction and migration for ballooned pages
mm: introduce a common interface for balloon pages mobility
mm: redefine address_space.assoc_mapping
mm: adjust address_space_operations.migratepage() return code
arch/sparc/kernel/sys_sparc_64.c: s/COLOUR/COLOR/
...
Add online_movable and online_kernel for logic memory hotplug. This is
the dynamic version of "movablecore" & "kernelcore".
We have the same reason to introduce it as to introduce "movablecore" &
"kernelcore". It has the same motive as "movablecore" & "kernelcore", but
it is dynamic/running-time:
o We can configure memory as kernelcore or movablecore after boot.
Userspace workload is increased, we need more hugepage, we can't use
"online_movable" to add memory and allow the system use more
THP(transparent-huge-page), vice-verse when kernel workload is increase.
Also help for virtualization to dynamic configure host/guest's memory,
to save/(reduce waste) memory.
Memory capacity on Demand
o When a new node is physically online after boot, we need to use
"online_movable" or "online_kernel" to configure/portion it as we
expected when we logic-online it.
This configuration also helps for physically-memory-migrate.
o all benefit as the same as existed "movablecore" & "kernelcore".
o Preparing for movable-node, which is very important for power-saving,
hardware partitioning and high-available-system(hardware fault
management).
(Note, we don't introduce movable-node here.)
Action behavior:
When a memoryblock/memorysection is onlined by "online_movable", the kernel
will not have directly reference to the page of the memoryblock,
thus we can remove that memory any time when needed.
When it is online by "online_kernel", the kernel can use it.
When it is online by "online", the zone type doesn't changed.
Current constraints:
Only the memoryblock which is adjacent to the ZONE_MOVABLE
can be online from ZONE_NORMAL to ZONE_MOVABLE.
[akpm@linux-foundation.org: use min_t, cleanups]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It is strange that alloc_bootmem() returns a virtual address and
free_bootmem() requires a physical address. Anyway, free_bootmem()'s
first parameter should be physical address.
There are some call sites for free_bootmem() with virtual address. So fix
them.
[akpm@linux-foundation.org: improve free_bootmem() and free_bootmem_pate() documentation]
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commits 2139cbe627 ("cma: fix counting of isolated pages") and
d95ea5d18e ("cma: fix watermark checking") introduced a reliable
method of free page accounting when memory is being allocated from CMA
regions, so the workaround introduced earlier by commit 49f223a9cd
("mm: trigger page reclaim in alloc_contig_range() to stabilise
watermarks") can be finally removed.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
test_set_oom_score_adj() and compare_swap_oom_score_adj() are used to
specify that current should be killed first if an oom condition occurs in
between the two calls.
The usage is
short oom_score_adj = test_set_oom_score_adj(OOM_SCORE_ADJ_MAX);
...
compare_swap_oom_score_adj(OOM_SCORE_ADJ_MAX, oom_score_adj);
to store the thread's oom_score_adj, temporarily change it to the maximum
score possible, and then restore the old value if it is still the same.
This happens to still be racy, however, if the user writes
OOM_SCORE_ADJ_MAX to /proc/pid/oom_score_adj in between the two calls.
The compare_swap_oom_score_adj() will then incorrectly reset the old value
prior to the write of OOM_SCORE_ADJ_MAX.
To fix this, introduce a new oom_flags_t member in struct signal_struct
that will be used for per-thread oom killer flags. KSM and swapoff can
now use a bit in this member to specify that threads should be killed
first in oom conditions without playing around with oom_score_adj.
This also allows the correct oom_score_adj to always be shown when reading
/proc/pid/oom_score.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The maximum oom_score_adj is 1000 and the minimum oom_score_adj is -1000,
so this range can be represented by the signed short type with no
functional change. The extra space this frees up in struct signal_struct
will be used for per-thread oom kill flags in the next patch.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
register_node() is defined as extern in include/linux/node.h. But the
function is only called from register_one_node() in driver/base/node.c.
So the patch defines register_node() as static.
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The PATCH "mm: introduce compaction and migration for virtio ballooned pages"
hacks around putback_lru_pages() in order to allow ballooned pages to be
re-inserted on balloon page list as if a ballooned page was like a LRU page.
As ballooned pages are not legitimate LRU pages, this patch introduces
putback_movable_pages() to properly cope with cases where the isolated
pageset contains ballooned pages and LRU pages, thus fixing the mentioned
inelegant hack around putback_lru_pages().
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Memory fragmentation introduced by ballooning might reduce significantly
the number of 2MB contiguous memory blocks that can be used within a guest,
thus imposing performance penalties associated with the reduced number of
transparent huge pages that could be used by the guest workload.
This patch introduces a common interface to help a balloon driver on
making its page set movable to compaction, and thus allowing the system
to better leverage the compation efforts on memory defragmentation.
[akpm@linux-foundation.org: use PAGE_FLAGS_CHECK_AT_PREP, s/__balloon_page_flags/page_flags_cleared/, small cleanups]
[rientjes@google.com: allow balloon compaction for any system with memory compaction enabled, which is the defconfig]
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Overhaul struct address_space.assoc_mapping renaming it to
address_space.private_data and its type is redefined to void*. By this
approach we consistently name the .private_* elements from struct
address_space as well as allow extended usage for address_space
association with other data structures through ->private_data.
Also, all users of old ->assoc_mapping element are converted to reflect
its new name and type change (->private_data).
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Memory fragmentation introduced by ballooning might reduce significantly
the number of 2MB contiguous memory blocks that can be used within a
guest, thus imposing performance penalties associated with the reduced
number of transparent huge pages that could be used by the guest workload.
This patch-set follows the main idea discussed at 2012 LSFMMS session:
"Ballooning for transparent huge pages" -- http://lwn.net/Articles/490114/
to introduce the required changes to the virtio_balloon driver, as well as
the changes to the core compaction & migration bits, in order to make
those subsystems aware of ballooned pages and allow memory balloon pages
become movable within a guest, thus avoiding the aforementioned
fragmentation issue
Following are numbers that prove this patch benefits on allowing
compaction to be more effective at memory ballooned guests.
Results for STRESS-HIGHALLOC benchmark, from Mel Gorman's mmtests suite,
running on a 4gB RAM KVM guest which was ballooning 512mB RAM in 64mB
chunks, at every minute (inflating/deflating), while test was running:
===BEGIN stress-highalloc
STRESS-HIGHALLOC
highalloc-3.7 highalloc-3.7
rc4-clean rc4-patch
Pass 1 55.00 ( 0.00%) 62.00 ( 7.00%)
Pass 2 54.00 ( 0.00%) 62.00 ( 8.00%)
while Rested 75.00 ( 0.00%) 80.00 ( 5.00%)
MMTests Statistics: duration
3.7 3.7
rc4-clean rc4-patch
User 1207.59 1207.46
System 1300.55 1299.61
Elapsed 2273.72 2157.06
MMTests Statistics: vmstat
3.7 3.7
rc4-clean rc4-patch
Page Ins 3581516 2374368
Page Outs 11148692 10410332
Swap Ins 80 47
Swap Outs 3641 476
Direct pages scanned 37978 33826
Kswapd pages scanned 1828245 1342869
Kswapd pages reclaimed 1710236 1304099
Direct pages reclaimed 32207 31005
Kswapd efficiency 93% 97%
Kswapd velocity 804.077 622.546
Direct efficiency 84% 91%
Direct velocity 16.703 15.682
Percentage direct scans 2% 2%
Page writes by reclaim 79252 9704
Page writes file 75611 9228
Page writes anon 3641 476
Page reclaim immediate 16764 11014
Page rescued immediate 0 0
Slabs scanned 2171904 2152448
Direct inode steals 385 2261
Kswapd inode steals 659137 609670
Kswapd skipped wait 1 69
THP fault alloc 546 631
THP collapse alloc 361 339
THP splits 259 263
THP fault fallback 98 50
THP collapse fail 20 17
Compaction stalls 747 499
Compaction success 244 145
Compaction failures 503 354
Compaction pages moved 370888 474837
Compaction move failure 77378 65259
===END stress-highalloc
This patch:
Introduce MIGRATEPAGE_SUCCESS as the default return code for
address_space_operations.migratepage() method and documents the expected
return code for the same method in failure cases.
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement vm_unmapped_area() using the rb_subtree_gap and highest_vm_end
information to look up for suitable virtual address space gaps.
struct vm_unmapped_area_info is used to define the desired allocation
request:
- lowest or highest possible address matching the remaining constraints
- desired gap length
- low/high address limits that the gap must fit into
- alignment mask and offset
Also update the generic arch_get_unmapped_area[_topdown] functions to make
use of vm_unmapped_area() instead of implementing a brute force search.
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The kernel walks the VMA rbtree in various places, including the page
fault path. However, the vm_rb node spanned two cache lines, on 64 bit
systems with 64 byte cache lines (most x86 systems).
Rearrange vm_area_struct a little, so all the information we need to do a
VMA tree walk is in the first cache line.
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>