Commit Graph

1227 Commits

Author SHA1 Message Date
Gang He d750c42ac2 ocfs2: add feature document for online file check
This document will describe OCFS2 online file check feature.  OCFS2 is
often used in high-availaibility systems.  However, OCFS2 usually
converts the filesystem to read-only when encounters an error.  This may
not be necessary, since turning the filesystem read-only would affect
other running processes as well, decreasing availability.

Then, a mount option (errors=continue) is introduced, which would return
the -EIO errno to the calling process and terminate furhter processing
so that the filesystem is not corrupted further.  The filesystem is not
converted to read-only, and the problematic file's inode number is
reported in the kernel log.  The user can try to check/fix this file via
online filecheck feature.

Signed-off-by: Gang He <ghe@suse.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Linus Torvalds 968f3e374f Merge branch 'for-linus-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs updates from Chris Mason:
 "We have a good sized cleanup of our internal read ahead code, and the
  first series of commits from Chandan to enable PAGE_SIZE > sectorsize

  Otherwise, it's a normal series of cleanups and fixes, with many
  thanks to Dave Sterba for doing most of the patch wrangling this time"

* 'for-linus-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (82 commits)
  btrfs: make sure we stay inside the bvec during __btrfs_lookup_bio_sums
  btrfs: Fix misspellings in comments.
  btrfs: Print Warning only if ENOSPC_DEBUG is enabled
  btrfs: scrub: silence an uninitialized variable warning
  btrfs: move btrfs_compression_type to compression.h
  btrfs: rename btrfs_print_info to btrfs_print_mod_info
  Btrfs: Show a warning message if one of objectid reaches its highest value
  Documentation: btrfs: remove usage specific information
  btrfs: use kbasename in btrfsic_mount
  Btrfs: do not collect ordered extents when logging that inode exists
  Btrfs: fix race when checking if we can skip fsync'ing an inode
  Btrfs: fix listxattrs not listing all xattrs packed in the same item
  Btrfs: fix deadlock between direct IO reads and buffered writes
  Btrfs: fix extent_same allowing destination offset beyond i_size
  Btrfs: fix file loss on log replay after renaming a file and fsync
  Btrfs: fix unreplayable log after snapshot delete + parent dir fsync
  Btrfs: fix lockdep deadlock warning due to dev_replace
  btrfs: drop unused argument in btrfs_ioctl_get_supported_features
  btrfs: add GET_SUPPORTED_FEATURES to the control device ioctls
  btrfs: change max_inline default to 2048
  ...
2016-03-21 18:12:42 -07:00
Linus Torvalds 814a2bf957 Merge branch 'akpm' (patches from Andrew)
Merge second patch-bomb from Andrew Morton:

 - a couple of hotfixes

 - the rest of MM

 - a new timer slack control in procfs

 - a couple of procfs fixes

 - a few misc things

 - some printk tweaks

 - lib/ updates, notably to radix-tree.

 - add my and Nick Piggin's old userspace radix-tree test harness to
   tools/testing/radix-tree/.  Matthew said it was a godsend during the
   radix-tree work he did.

 - a few code-size improvements, switching to __always_inline where gcc
   screwed up.

 - partially implement character sets in sscanf

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits)
  sscanf: implement basic character sets
  lib/bug.c: use common WARN helper
  param: convert some "on"/"off" users to strtobool
  lib: add "on"/"off" support to kstrtobool
  lib: update single-char callers of strtobool()
  lib: move strtobool() to kstrtobool()
  include/linux/unaligned: force inlining of byteswap operations
  include/uapi/linux/byteorder, swab: force inlining of some byteswap operations
  include/asm-generic/atomic-long.h: force inlining of some atomic_long operations
  usb: common: convert to use match_string() helper
  ide: hpt366: convert to use match_string() helper
  ata: hpt366: convert to use match_string() helper
  power: ab8500: convert to use match_string() helper
  power: charger_manager: convert to use match_string() helper
  drm/edid: convert to use match_string() helper
  pinctrl: convert to use match_string() helper
  device property: convert to use match_string() helper
  lib/string: introduce match_string() helper
  radix-tree tests: add test for radix_tree_iter_next
  radix-tree tests: add regression3 test
  ...
2016-03-18 19:26:54 -07:00
Christoph Hellwig f99d4fbdae nfsd: add SCSI layout support
This is a simple extension to the block layout driver to use SCSI
persistent reservations for access control and fencing, as well as
SCSI VPD pages for device identification.

For this we need to pass the nfs4_client to the proc_getdeviceinfo method
to generate the reservation key, and add a new fence_client method
to allow for fence actions in the layout driver.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-03-18 11:42:53 -04:00
Linus Torvalds 364e8dd9d6 Configfs changes for the 4.6 merge window:
- A large patch from me to simplify setting up the list of default
    groups by actually implementing it as a list instead of an array.
  - a small Y2083 prep patch from Deepa Dinamani.  Probably doesn't matter
    on it's own, but it seems like he is trying to get rid of all CURRENT_TIME
    uses in file systems, which is a worthwhile goal.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW6Cz6AAoJEA+eU2VSBFGDmNYP/AzJuVdkXjOkzmAl0SjwS0UC
 b/gTF0Z0jAmXX8QTf0NtdNajHweYyY4PVvyuUYojO/Y9bgJigRC6gHIUviq8TLhO
 JR1EUJ3RNoWFZSHeEGTM4q+kSg3GkZ83WixeBiMkIZo7QgPXU2YB0mzErpdcID3N
 +KVnoVU+asVQi656UIDNZ1SawTAGog+tIMIgnM4vmL0Dd+9yN4pYhAmRLLS0C83P
 DPci/oVx1a3IjWAkmz24qtb9ht/SA+IBwyFPltg/gdn5OgJL9Vr1naW5mkqMhoPF
 PUBfX9YYizMwNMYuchng6JqyWlZBjXFr6iqi401vFJcILeq27As5Kc9adfDOEvVC
 V/dWCmTyMlHX507t+lC7kTa6OaHAZKA5scCHA6dgpQIvGfiaMNNu7MW8C6p0HqwY
 rf7na7S2fAu5zCyIRVPK//YMNbRHh2AoclzpK7Sw0NCV5jBlXZOdDJcSb4jQsVF7
 Yy84EqcebvF4ocaFRzwA/ZHNxz65l5Qu7brmOu6pTliQuQED1fop5z92RXkw2e9y
 rSIgzMCL5IoAUkYtoO1jzAQXzyySAb3QDpwCaBdZLzN4MbRF/dUxZDkOePKTaVft
 ckNXj5AVzvLYlpkmkhQ+bqsh91ayFH2/gw9Kt38i1yjzNLhsccZwq9ja5ifPlHLQ
 nOFiane31yp3Zhac8drb
 =9HqT
 -----END PGP SIGNATURE-----

Merge tag 'configfs-for-linus' of git://git.infradead.org/users/hch/configfs

Pull configfs updates from Christoph Hellwig:

 - A large patch from me to simplify setting up the list of default
   groups by actually implementing it as a list instead of an array.

 - a small Y2083 prep patch from Deepa Dinamani.  Probably doesn't
   matter on it's own, but it seems like he is trying to get rid of all
   CURRENT_TIME uses in file systems, which is a worthwhile goal.

* tag 'configfs-for-linus' of git://git.infradead.org/users/hch/configfs:
  configfs: switch ->default groups to a linked list
  configfs: Replace CURRENT_TIME by current_fs_time()
2016-03-17 16:25:46 -07:00
John Stultz 5de23d435e proc: add /proc/<pid>/timerslack_ns interface
This patch provides a proc/PID/timerslack_ns interface which exposes a
task's timerslack value in nanoseconds and allows it to be changed.

This allows power/performance management software to set timer slack for
other threads according to its policy for the thread (such as when the
thread is designated foreground vs.  background activity)

If the value written is non-zero, slack is set to that value.  Otherwise
sets it to the default for the thread.

This interface checks that the calling task has permissions to to use
PTRACE_MODE_ATTACH_FSCREDS on the target task, so that we can ensure
arbitrary apps do not change the timer slack for other apps.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Linus Torvalds 96b9b1c956 TTY/Serial patches for 4.6-rc1
Here's the big tty/serial driver pull request for 4.6-rc1.
 
 Lots of changes in here, Peter has been on a tear again, with lots of
 refactoring and bugs fixes, many thanks to the great work he has been
 doing.  Lots of driver updates and fixes as well, full details in the
 shortlog.
 
 All have been in linux-next for a while with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlbp8z8ACgkQMUfUDdst+ym1vwCgnOOCORaZyeQ4QrcxPAK5pHFn
 VrMAoNHvDgNYtG+Hmzv25Lgp3HnysPin
 =MLRG
 -----END PGP SIGNATURE-----

Merge tag 'tty-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial updates from Greg KH:
 "Here's the big tty/serial driver pull request for 4.6-rc1.

  Lots of changes in here, Peter has been on a tear again, with lots of
  refactoring and bugs fixes, many thanks to the great work he has been
  doing.  Lots of driver updates and fixes as well, full details in the
  shortlog.

  All have been in linux-next for a while with no reported issues"

* tag 'tty-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (220 commits)
  serial: 8250: describe CONFIG_SERIAL_8250_RSA
  serial: samsung: optimize UART rx fifo access routine
  serial: pl011: add mark/space parity support
  serial: sa1100: make sa1100_register_uart_fns a function
  tty: serial: 8250: add MOXA Smartio MUE boards support
  serial: 8250: convert drivers to use up_to_u8250p()
  serial: 8250/mediatek: fix building with SERIAL_8250=m
  serial: 8250/ingenic: fix building with SERIAL_8250=m
  serial: 8250/uniphier: fix modular build
  Revert "drivers/tty/serial: make 8250/8250_ingenic.c explicitly non-modular"
  Revert "drivers/tty/serial: make 8250/8250_mtk.c explicitly non-modular"
  serial: mvebu-uart: initial support for Armada-3700 serial port
  serial: mctrl_gpio: Add missing module license
  serial: ifx6x60: avoid uninitialized variable use
  tty/serial: at91: fix bad offset for UART timeout register
  tty/serial: at91: restore dynamic driver binding
  serial: 8250: Add hardware dependency to RT288X option
  TTY, devpts: document pty count limiting
  tty: goldfish: support platform_device with id -1
  drivers: tty: goldfish: Add device tree bindings
  ...
2016-03-17 13:53:25 -07:00
Linus Torvalds 37aa7319cd Another relatively boring cycle for the docs tree: typo fixes, translation
updates, etc.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW6IqKAAoJEI3ONVYwIuV6zawP/2NG5Rh9a/6B9YbtVsXZrWD+
 wCK80j+tbH9yAba54GjdJ2nvMKSYpC62ti4sPwtdSllvd0/e2ZnS4ZLa9sSoMDsH
 GK10mYhUcosDfeNCadT66Wl9o8ICu2QqNvGUtjI1331dcjAKBxLjJfaVQ9nddcdP
 Ksr8+ATr1rQE8YfxdprOnaX06C0FKqaQfRWdBl4FNDh6pb6eeXWnP56A+m3zi7e8
 R2uJGlHwxq3qS27JkQDwiWFb7FVPt1H6bfM9DcgiGfItfgoosW1cExxND4MoqEZT
 8ATE1H4Zbn6QOkq88Wo4vDn3gDIsgy7D7rS69juFNzCajxIOaN8NfPusGFmTCpQU
 cuh6XDYKS2n05RU0tuLB9NxTd+1b50COVHD//dEhjvpF+Z7Ku6yM2LrhTYYvoAsm
 XFYUUrJ8i09D0hllRA8WGE+AYGLvyFV9AE1mJQPepx+9TjBluKuqIoig03kEWfof
 3xXhigFy+WttLQZiqten3YMVSo4xQRNq7FgqZ20dPMF1Exnj1wkwDFrE3SBB3Xcv
 3eN953O12JzL7ja+9/OPMKZt8xs/tzLAScMP4R/nXGG050aecRH/xFFrOlzPDdyQ
 oshon67cMGi5qJdHaOkALqwz89euXv/AMHxGixu1M0zJG81WQm1u6vi9lY7b8Y3q
 ojA8vgsG3AprS9nmxqVh
 =wBKa
 -----END PGP SIGNATURE-----

Merge tag 'docs-for-linus' of git://git.lwn.net/linux

Pul documentation update from Jon Corbet:
 "Another relatively boring cycle for the docs tree: typo fixes,
  translation updates, etc"

* tag 'docs-for-linus' of git://git.lwn.net/linux:
  modsign: Fix documentation on module signing enforcement parameter.
  Doc: nfs: Fix typos in Documentation/filesystems/nfs
  Documentation: kselftest: Remove duplicate word
  doc: fix grammar
  Documentation: Howto: Fixed subtitles style
  Doc: ARM: Fix a typo in clksrc-change-registers.awk
  Documentation/ko_KR: update maintainer information
  Documentation: Fix int/unsigned int comparison
  Documentation: Chinese translation of arm64/silicon-errata.txt
  Documentation:Update Documentation/zh_CN/arm64/booting.txt
  Documentation: HOWTO: remove obsolete info about regression postings
  Doc: ja_JP: Fix a typo in HOWTO
  Doc: i2c: Fix typo in Documentation/i2c
  Doc: DocBook: Fix a typo in device-drivers.tmpl
  Remove "arch" usage in Documentation/features/list-arch.sh
  README: cosmetic fixes
  Documentation/CodingStyle: add space before parenthesis in example macro
  SubmittingPatches: fix spelling of "git send-email"
2016-03-17 12:09:35 -07:00
Mike Marshall ab6652524a Linux 4.5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJW5j4RAAoJEHm+PkMAQRiGhVEH/0qZbM1J+WnCK92bm9+inCnB
 JO2JViGIuCQB5BxljVMil2dzrw85D+dC7+fryr0wVBhhBlr0lXPJGSYCYYTEaI20
 Wco5YlTmjRirUwmxWzBXvB5kvTdIaNfNYDcFch6lbsaLUNgqydNKtk08ckO/4k0D
 AmaShW8swBiXE/RmHuj8H41ksHsnY8W62dlczEaAIfr4kluPX/kKnyXpmpvmZm1j
 sM4fskPlq+Jz5pOXXFsFfrhiBgpSUnwSj1tNwK5+DkmaVnWOkPuwkqLBWqpy4pzm
 GTeDBdf5/ixGxgNsZ2VWtbPnc2wEP7SIcu45MU7QFw5kqwDN2nN63BRVXI5Z5qY=
 =RFx2
 -----END PGP SIGNATURE-----

Orangefs: merge to v4.5

Merge tag 'v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into current

Linux 4.5
2016-03-14 15:39:42 -04:00
David Sterba 6788f5ca64 Documentation: btrfs: remove usage specific information
The document in the kernel sources is yet another palce where the
documentation would need to be updated, while it is not the primary
source. We actively maintain the wiki pages.

Signed-off-by: David Sterba <dsterba@suse.com>
2016-03-11 17:02:09 +01:00
Masanari Iida 0d6f3ebf9e Doc: nfs: Fix typos in Documentation/filesystems/nfs
This patch fix spelling typos found in Documentation/filesystems/nfs

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2016-03-09 16:13:04 -07:00
Javi Merino c22d6bef7c doc: fix grammar
Some minor typos:

  - make is unbindable -> make it unbindable
  - a underlying -> an underlying
  - different version -> different versions

Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2016-03-09 15:33:06 -07:00
Konstantin Khlebnikov 8b253b07e9 TTY, devpts: document pty count limiting
Logic has been changed in kernel 3.4 by commit e9aba5158a
("tty: rework pty count limiting") but still not documented.

Sysctl kernel.pty.max works as global limit, kernel.pty.reserve ptys
are reserved for initial devpts instance (mounted without "newinstance").
Per-instance limit also could be set by mount option "max=%d".

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-07 16:11:14 -08:00
Christoph Hellwig 1ae1602de0 configfs: switch ->default groups to a linked list
Replace the current NULL-terminated array of default groups with a linked
list.  This gets rid of lots of nasty code to size and/or dynamically
allocate the array.

While we're at it also provide a conveniant helper to remove the default
groups.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Felipe Balbi <balbi@kernel.org>		[drivers/usb/gadget]
Acked-by: Joel Becker <jlbec@evilplan.org>
Acked-by: Nicholas Bellinger <nab@linux-iscsi.org>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
2016-03-06 16:11:24 +01:00
Chris Mason c05c5ee5ea Btrfs patchsets for 4.6
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCgAGBQJW0GSnAAoJEMVl1fnXbVg75qAP/0xbZPJtvTgRMSRnARtFJ28w
 vCsxqY+AatNJDuEpg2My/vscZvAVXGcTWjnM8NkXMMKN+oags47QN4qD0cuNv2kI
 JWcz7Ppt3GY6lcQbTj/Ce6N8RPRCNGsU7vxev+sKZ+jjXn+vuc+wKXnyJgaL1qcN
 XhcP2MccrXTVVJXLbGMFoaJXWWfd2i9uJ2MplmjFP7HQi5zP+5t/dsVaAQbc1dqx
 2TqgTJkUEPQqK8geAKom5wdLTmpLSgMWvg1m4lkYpDO89Fi+hFAKeeuJZvNutxVa
 hA0QLrLyZmr4tbZhM1of35Kl7N1uwCzOd8u6xsxurB12bibz67RbQpK+fazlCjKa
 wZJvJV+N3gqgCusLHlXYX0YalQxpWRQiKkjzpMy3Pq4K4soLrw20tQOnnBFhLR1y
 ZwqmZUN33lhFNCIWqLS4BLqDG+Z7Sf2aGhFtspMDjSUJe9gLbIpvH9sW6CexJI2r
 FnxTaVZ08uY0ky1dvZcRDR6zDDbVUpoQKWmwdZpxoEO1eLKjD01VsMOw5zlAaxdc
 a5SxKMVt0Gq56oTPgp0MuLHJr20pxx03yr+yl69VM8R1dAG/y61Dq5DwiFNQ8+J6
 jrX+eVYGBgTNYw/UGb14UPwVjQFFEs/vouphy6MmOVvNz+YZI6thN1uScB0vw7BV
 p/oFts5Fo0ipJgaBzGu4
 =CRdD
 -----END PGP SIGNATURE-----

Merge tag 'for-chris' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.6

Btrfs patchsets for 4.6
2016-03-01 08:13:56 -08:00
Mike Marshall 9f08cfe944 Orangefs: update orangefs.txt
Al Viro has cleaned up the way ops are processed and waited for,
now orangefs.txt has an overview of how it works. Several recent
related commits have added to the comments in the code as well.

Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-02-26 14:39:08 -05:00
Qu Wenruo 96da09192c btrfs: Introduce new mount option to disable tree log replay
Introduce a new mount option "nologreplay" to co-operate with "ro" mount
option to get real readonly mount, like "norecovery" in ext* and xfs.

Since the new parse_options() need to check new flags at remount time,
so add a new parameter for parse_options().

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Tested-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-12 15:14:49 +01:00
Qu Wenruo 8dcddfa048 btrfs: Introduce new mount option usebackuproot to replace recovery
Current "recovery" mount option will only try to use backup root.
However the word "recovery" is too generic and may be confusing for some
users.

Here introduce a new and more specific mount option, "usebackuproot" to
replace "recovery" mount option.
"Recovery" will be kept for compatibility reason, but will be
deprecated.

Also, since "usebackuproot" will only affect mount behavior and after
open_ctree() it has nothing to do with the filesystem, so clear the flag
after mount succeeded.

This provides the basis for later unified "norecovery" mount option.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
[ dropped usebackuproot from show_mount, added note about 'recovery' to
  docs ]
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-12 15:14:14 +01:00
Peter Jones ed8b0de5a3 efi: Make efivarfs entries immutable by default
"rm -rf" is bricking some peoples' laptops because of variables being
used to store non-reinitializable firmware driver data that's required
to POST the hardware.

These are 100% bugs, and they need to be fixed, but in the mean time it
shouldn't be easy to *accidentally* brick machines.

We have to have delete working, and picking which variables do and don't
work for deletion is quite intractable, so instead make everything
immutable by default (except for a whitelist), and make tools that
aren't quite so broad-spectrum unset the immutable flag.

Signed-off-by: Peter Jones <pjones@redhat.com>
Tested-by: Lee, Chun-Yi <jlee@suse.com>
Acked-by: Matthew Garrett <mjg59@coreos.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-02-10 16:25:52 +00:00
Konstantin Khlebnikov 30bdbb7800 mm: polish virtual memory accounting
* add VM_STACK as alias for VM_GROWSUP/DOWN depending on architecture
* always account VMAs with flag VM_STACK as stack (as it was before)
* cleanup classifying helpers
* update comments and documentation

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Tested-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-03 08:28:43 -08:00
Johannes Weiner 65376df582 proc: revert /proc/<pid>/maps [stack:TID] annotation
Commit b76437579d ("procfs: mark thread stack correctly in
proc/<pid>/maps") added [stack:TID] annotation to /proc/<pid>/maps.

Finding the task of a stack VMA requires walking the entire thread list,
turning this into quadratic behavior: a thousand threads means a
thousand stacks, so the rendering of /proc/<pid>/maps needs to look at a
million combinations.

The cost is not in proportion to the usefulness as described in the
patch.

Drop the [stack:TID] annotation to make /proc/<pid>/maps (and
/proc/<pid>/numa_maps) usable again for higher thread counts.

The [stack] annotation inside /proc/<pid>/task/<tid>/maps is retained, as
identifying the stack VMA there is an O(1) operation.

Siddesh said:
 "The end users needed a way to identify thread stacks programmatically and
  there wasn't a way to do that.  I'm afraid I no longer remember (or have
  access to the resources that would aid my memory since I changed
  employers) the details of their requirement.  However, I did do this on my
  own time because I thought it was an interesting project for me and nobody
  really gave any feedback then as to its utility, so as far as I am
  concerned you could roll back the main thread maps information since the
  information is available in the thread-specific files"

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com>
Cc: Shaohua Li <shli@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-03 08:28:43 -08:00
Namjae Jeon 28016128d3 Documentation/filesystems/vfat.txt: update the limitation for fat fallocate
Update the limitation for fat fallocate.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Linus Torvalds e535d74bc5 A relatively boring cycle in the docs tree. There's a few kernel-doc
fixes and various document tweaks.
 
 One patch reaches out of the documentation subtree to fix a comment in
 init/do_mounts_rd.c.  There didn't seem to be anybody more appropriate to
 take that one, so I accepted it.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWmmJ8AAoJEI3ONVYwIuV6uqwP/0mnqdxVWo47ohaYJP7q0Soh
 ovJAbfttxKnkmOdGbWcNIJtTiw+MpdF805CYR+2treE0zvEEDodg7BhkDnmKZJ9n
 F1r53JrIj769E1c5ETmWTHcBt3jjtKyQIbBmDr4YTgX91dlKF28o1bMmyDECWIcT
 PktTlPUidDtffKMn3klh6baPCMrTpLJ8aLshBzUrQhrQY8lxcZKAU+98vtFzYofG
 LXCSulMYXumb7XBxErTLQZhmJslD4gaDMh2xkov6ALS8XNHnfoUIFRbArAllNfTf
 LQGJ6Q5qnn58UWi9F/vgDqx7+d1KIPUjBxJR9wfa0w9ggQhA9ly2BSN/fllbiSbp
 yIi1JS4hwBe8H/h577BNC3xjmgVN7mazZsXlS+fg3G16gpv4JdWeRY4efjosFIzQ
 EIJxB8qAovUNqw4s1mzRIJ5B9L7PEK27O6z8N27Fiw4EigtMTFAOC2/GD3ELx4iJ
 p1doiSr+wjfDcFd8kdIUiDKGrTSTXwNy3hUfrhzQyaEjDTJnx3+1+ono1orSazPO
 Fr2RSsC5VzX4IYSuxTMvFSKjN1Iiu8xqwq3IdclHXrBhRvwOF2wpjjQ5Guf0lHBJ
 FLBahSjZqt01kmwFykxoHps+VeSwpoEen6rClBQolfmtYVDTvgRNN46AxK9jZ8T4
 jZmCNNs/mYzrqo/RTnmw
 =u38W
 -----END PGP SIGNATURE-----

Merge tag 'docs-4.5' of git://git.lwn.net/linux

Pull documentation updates from Jon Corbet:
 "A relatively boring cycle in the docs tree.  There's a few kernel-doc
  fixes and various document tweaks.

  One patch reaches out of the documentation subtree to fix a comment in
  init/do_mounts_rd.c.  There didn't seem to be anybody more appropriate
  to take that one, so I accepted it"

* tag 'docs-4.5' of git://git.lwn.net/linux: (29 commits)
  thermal: add description for integral_cutoff unit
  Documentation: update libhugetlbfs site url
  Documentation: Explain pci=conf1,conf2 more verbosely
  DMA-API: fix confusing sentence in Documentation/DMA-API.txt
  Documentation: translations: update linux cross reference link
  Documentation: fix typo in CodingStyle
  init, Documentation: Remove ramdisk_blocksize mentions
  Documentation-getdelays: Apply a recommendation from "checkpatch.pl" in main()
  Documentation: HOWTO: update versions from 3.x to 4.x
  Documentation: remove outdated references from translations
  Doc: treewide: Fix grammar "a" to "an"
  Documentation: cpu-hotplug: Fix sysfs mount instructions
  can-doc: Add hint about getting timestamps
  Fix CFQ I/O scheduler parameter name in documentation
  Documentation: arm: remove dead links from Marvell Berlin docs
  Documentation: HOWTO: update code cross reference link
  Doc: Docbook/iio: Fix typo in iio.tmpl
  DocBook: make index.html generation less verbose by default
  DocBook: Cleanup: remove an unused $(call) line
  DocBook: Add a help message for DOCBOOKS env var
  ...
2016-01-17 11:55:07 -08:00
Linus Torvalds 875fc4f5dd Merge branch 'akpm' (patches from Andrew)
Merge first patch-bomb from Andrew Morton:

 - A few hotfixes which missed 4.4 becasue I was asleep.  cc'ed to
   -stable

 - A few misc fixes

 - OCFS2 updates

 - Part of MM.  Including pretty large changes to page-flags handling
   and to thp management which have been buffered up for 2-3 cycles now.

  I have a lot of MM material this time.

[ It turns out the THP part wasn't quite ready, so that got dropped from
  this series  - Linus ]

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (117 commits)
  zsmalloc: reorganize struct size_class to pack 4 bytes hole
  mm/zbud.c: use list_last_entry() instead of list_tail_entry()
  zram/zcomp: do not zero out zcomp private pages
  zram: pass gfp from zcomp frontend to backend
  zram: try vmalloc() after kmalloc()
  zram/zcomp: use GFP_NOIO to allocate streams
  mm: add tracepoint for scanning pages
  drivers/base/memory.c: fix kernel warning during memory hotplug on ppc64
  mm/page_isolation: use macro to judge the alignment
  mm: fix noisy sparse warning in LIBCFS_ALLOC_PRE()
  mm: rework virtual memory accounting
  include/linux/memblock.h: fix ordering of 'flags' argument in comments
  mm: move lru_to_page to mm_inline.h
  Documentation/filesystems: describe the shared memory usage/accounting
  memory-hotplug: don't BUG() in register_memory_resource()
  hugetlb: make mm and fs code explicitly non-modular
  mm/swapfile.c: use list_for_each_entry_safe in free_swap_count_continuations
  mm: /proc/pid/clear_refs: no need to clear VM_SOFTDIRTY in clear_soft_dirty_pmd()
  mm: make sure isolate_lru_page() is never called for tail page
  vmstat: make vmstat_updater deferrable again and shut down on idle
  ...
2016-01-15 11:41:44 -08:00
Linus Torvalds 63f729cb4a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fix from Al Viro:
 "Don't put symlink bodies in pagecache into highmem"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  Make sure that highmem pages are not added to symlink page cache
2016-01-14 16:03:57 -08:00
Rodrigo Freire 0bc126d460 Documentation/filesystems: describe the shared memory usage/accounting
The Shared Memory accounting support is present in Kernel since commit
4b02108ac1 ("mm: oom analysis: add shmem vmstat") and in userland
free(1) since 2014.  This patch updates the Documentation to reflect
this change.

Signed-off-by: Rodrigo Freire <rfreire@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 16:00:49 -08:00
Jerome Marchand 8cee852ec5 mm, procfs: breakdown RSS for anon, shmem and file in /proc/pid/status
There are several shortcomings with the accounting of shared memory
(SysV shm, shared anonymous mapping, mapping of a tmpfs file).  The
values in /proc/<pid>/status and <...>/statm don't allow to distinguish
between shmem memory and a shared mapping to a regular file, even though
theirs implication on memory usage are quite different: during reclaim,
file mapping can be dropped or written back on disk, while shmem needs a
place in swap.

Also, to distinguish the memory occupied by anonymous and file mappings,
one has to read the /proc/pid/statm file, which has a field for the file
mappings (again, including shmem) and total memory occupied by these
mappings (i.e.  equivalent to VmRSS in the <...>/status file.  Getting
the value for anonymous mappings only is thus not exactly user-friendly
(the statm file is intended to be rather efficiently machine-readable).

To address both of these shortcomings, this patch adds a breakdown of
VmRSS in /proc/<pid>/status via new fields RssAnon, RssFile and
RssShmem, making use of the previous preparatory patch.  These fields
tell the user the memory occupied by private anonymous pages, mapped
regular files and shmem, respectively.  Other existing fields in /status
and /statm files are left without change.  The /statm file can be
extended in the future, if there's a need for that.

Example (part of) /proc/pid/status output including the new Rss* fields:

VmPeak:  2001008 kB
VmSize:  2001004 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:      5108 kB
VmRSS:      5108 kB
RssAnon:              92 kB
RssFile:            1324 kB
RssShmem:           3692 kB
VmData:      192 kB
VmStk:       136 kB
VmExe:         4 kB
VmLib:      1784 kB
VmPTE:      3928 kB
VmPMD:        20 kB
VmSwap:        0 kB
HugetlbPages:          0 kB

[vbabka@suse.cz: forward-porting, tweak changelog]
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 16:00:49 -08:00
Vlastimil Babka c261e7d94f mm, proc: account for shmem swap in /proc/pid/smaps
Currently, /proc/pid/smaps will always show "Swap: 0 kB" for
shmem-backed mappings, even if the mapped portion does contain pages
that were swapped out.  This is because unlike private anonymous
mappings, shmem does not change pte to swap entry, but pte_none when
swapping the page out.  In the smaps page walk, such page thus looks
like it was never faulted in.

This patch changes smaps_pte_entry() to determine the swap status for
such pte_none entries for shmem mappings, similarly to how
mincore_page() does it.  Swapped out shmem pages are thus accounted for.
For private mappings of tmpfs files that COWed some of the pages, swaped
out status of the original shmem pages is naturally ignored.  If some of
the private copies was also swapped out, they are accounted via their
page table swap entries, so the resulting reported swap usage is then a
sum of both swapped out private copies, and swapped out shmem pages that
were not COWed.  No double accounting can thus happen.

The accounting is arguably still not as precise as for private anonymous
mappings, since now we will count also pages that the process in
question never accessed, but another process populated them and then let
them become swapped out.  I believe it is still less confusing and
subtle than not showing any swap usage by shmem mappings at all.
Swapped out counter might of interest of users who would like to prevent
from future swapins during performance critical operation and pre-fault
them at their convenience.  Especially for larger swapped out regions
the cost of swapin is much higher than a fresh page allocation.  So a
differentiation between pte_none vs.  swapped out is important for those
usecases.

One downside of this patch is that it makes /proc/pid/smaps more
expensive for shmem mappings, as we consult the radix tree for each
pte_none entry, so the overal complexity is O(n*log(n)).  I have
measured this on a process that creates a 2GB mapping and dirties single
pages with a stride of 2MB, and time how long does it take to cat
/proc/pid/smaps of this process 100 times.

Private anonymous mapping:

real    0m0.949s
user    0m0.116s
sys     0m0.348s

Mapping of a /dev/shm/file:

real    0m3.831s
user    0m0.180s
sys     0m3.212s

The difference is rather substantial, so the next patch will reduce the
cost for shared or read-only mappings.

In a less controlled experiment, I've gathered pids of processes on my
desktop that have either '/dev/shm/*' or 'SYSV*' in smaps.  This
included the Chrome browser and some KDE processes.  Again, I've run cat
/proc/pid/smaps on each 100 times.

Before this patch:

real    0m9.050s
user    0m0.518s
sys     0m8.066s

After this patch:

real    0m9.221s
user    0m0.541s
sys     0m8.187s

This suggests low impact on average systems.

Note that this patch doesn't attempt to adjust the SwapPss field for
shmem mappings, which would need extra work to determine who else could
have the pages mapped.  Thus the value stays zero except for COWed
swapped out pages in a shmem mapping, which are accounted as usual.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 16:00:49 -08:00
Vlastimil Babka bf9683d699 mm, documentation: clarify /proc/pid/status VmSwap limitations for shmem
This series is based on Jerome Marchand's [1] so let me quote the first
paragraph from there:

There are several shortcomings with the accounting of shared memory
(sysV shm, shared anonymous mapping, mapping to a tmpfs file).  The
values in /proc/<pid>/status and statm don't allow to distinguish
between shmem memory and a shared mapping to a regular file, even though
their implications on memory usage are quite different: at reclaim, file
mapping can be dropped or written back on disk while shmem needs a place
in swap.  As for shmem pages that are swapped-out or in swap cache, they
aren't accounted at all.

The original motivation for myself is that a customer found (IMHO
rightfully) confusing that e.g.  top output for process swap usage is
unreliable with respect to swapped out shmem pages, which are not
accounted for.

The fundamental difference between private anonymous and shmem pages is
that the latter has PTE's converted to pte_none, and not swapents.  As
such, they are not accounted to the number of swapents visible e.g.  in
/proc/pid/status VmSwap row.  It might be theoretically possible to use
swapents when swapping out shmem (without extra cost, as one has to
change all mappers anyway), and on swap in only convert the swapent for
the faulting process, leaving swapents in other processes until they
also fault (so again no extra cost).  But I don't know how many
assumptions this would break, and it would be too disruptive change for
a relatively small benefit.

Instead, my approach is to document the limitation of VmSwap, and
provide means to determine the swap usage for shmem areas for those who
are interested and willing to pay the price, using /proc/pid/smaps.
Because outside of ipcs, I don't think it's possible to currently to
determine the usage at all.  The previous patchset [1] did introduce new
shmem-specific fields into smaps output, and functions to determine the
values.  I take a simpler approach, noting that smaps output already has
a "Swap: X kB" line, where currently X == 0 always for shmem areas.  I
think we can just consider this a bug and provide the proper value by
consulting the radix tree, as e.g.  mincore_page() does.  In the patch
changelog I explain why this is also not perfect (and cannot be without
swapents), but still arguably much better than showing a 0.

The last two patches are adapted from Jerome's patchset and provide a
VmRSS breakdown to RssAnon, RssFile and RssShm in /proc/pid/status.
Hugh noted that this is a welcome addition, and I agree that it might
help e.g.  debugging process memory usage at albeit non-zero, but still
rather low cost of extra per-mm counter and some page flag checks.

[1] http://lwn.net/Articles/611966/

This patch (of 6):

The documentation for /proc/pid/status does not mention that the value
of VmSwap counts only swapped out anonymous private pages, and not
swapped out pages of the underlying shmem objects (for shmem mappings).
This is not obvious, so document this limitation.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 16:00:49 -08:00
Al Viro e8ecde25f5 Make sure that highmem pages are not added to symlink page cache
inode_nohighmem() is sufficient to make sure that page_get_link()
won't try to allocate a highmem page.  Moreover, it is sufficient
to make sure that page_symlink/__page_symlink won't do the same
thing.  However, any filesystem that manually preseeds the symlink's
page cache upon symlink(2) needs to make sure that the page it
inserts there won't be a highmem one.

Fortunately, only nfs and shmem have run afoul of that...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-14 17:56:54 -05:00
SeongJae Park ceec86ec06 Documentation: update libhugetlbfs site url
The site for libhugetlbfs has moved from sourceforge to github. This
commit updates the old url.

Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2016-01-14 13:27:48 -07:00
Linus Torvalds f9a03ae123 Merge tag 'for-f2fs-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
 "This series adds two ioctls to control cached data and fragmented
  files.  Most of the rest fixes missing error cases and bugs that we
  have not covered so far.  Summary:

  Enhancements:
   - support an ioctl to execute online file defragmentation
   - support an ioctl to flush cached data
   - speed up shrinking of extent_cache entries
   - handle broken superblock
   - refector dirty inode management infra
   - revisit f2fs_map_blocks to handle more cases
   - reduce global lock coverage
   - add detecting user's idle time

  Major bug fixes:
   - fix data race condition on cached nat entries
   - fix error cases of volatile and atomic writes"

* tag 'for-f2fs-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (87 commits)
  f2fs: should unset atomic flag after successful commit
  f2fs: fix wrong memory condition check
  f2fs: monitor the number of background checkpoint
  f2fs: detect idle time depending on user behavior
  f2fs: introduce time and interval facility
  f2fs: skip releasing nodes in chindless extent tree
  f2fs: use atomic type for node count in extent tree
  f2fs: recognize encrypted data in f2fs_fiemap
  f2fs: clean up f2fs_balance_fs
  f2fs: remove redundant calls
  f2fs: avoid unnecessary f2fs_balance_fs calls
  f2fs: check the page status filled from disk
  f2fs: introduce __get_node_page to reuse common code
  f2fs: check node id earily when readaheading node page
  f2fs: read isize while holding i_mutex in fiemap
  Revert "f2fs: check the node block address of newly allocated nid"
  f2fs: cover more area with nat_tree_lock
  f2fs: introduce max_file_blocks in sbi
  f2fs crypto: check CONFIG_F2FS_FS_XATTR for encrypted symlink
  f2fs: introduce zombie list for fast shrinking extent trees
  ...
2016-01-13 21:01:44 -08:00
Mike Marshall fcac9d5715 Orangefs: add protocol information to Documentation/filesystems/orangefs.txt
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-13 14:28:13 -05:00
Linus Torvalds 420d12d6ad Configfs changes for the 4.5 merge window:
- I'm assisting Joel as co-maintainer and patch monkey now, and you will
    see pull reuquests from me for a while
  - Besides the MAINTAINERS update there is just a single change, which
    adds support for binary attributes to configfs, which are very similar
    to the sysfs binary attributes.  Thanks to Pantelis Antoniou!
  - You will see another actually bigger set of configfs changes in the
    SCSI target pull from Nic - those were merged before this new tree
    even existed
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJWlVY2AAoJEA+eU2VSBFGD7usQAMtzTCqxKH9imqMle0434tE0
 OSY/CEi8sE+0m+u0p0rszi9G3Mneyx6Gsm1enuNUfsKDMTkdhTggVbU1IaLzGddg
 ZJbjQwJXgSC4bRO+I1AgCXlJmSLVRWhw+adUqqtKuL5L8tD08vUCXz+52nQOIlh2
 uP1PYz4Y5JyWQDviv0S6pXN53yf5y60P4eE6ghfnP9VvaD7fwIYIxMfxUQvKsvaS
 z6Aab/XrfkmoQK2A8y21LA4nHA6TS5Aa7IrqCp7G3Nks/An1qZIMU8bZpc+HfpX+
 lRd6TU4mHhabs3cOJBELV7ZxCotTSh5XhHyQaDBMo3BDCxl7UNGdY8+V7Xx5xK6D
 RHdv9g+ObUotAAfd3NXCV1fiUzUaeInRmolvLYr/o+ftPXVL7C80HerAXZHeiQXx
 u9VEkb6tTcLLoxAoIPTJ6CogSWfvYTpocoSmaLaJspKQ8859BoYsgDCjkGQJOnZP
 hngof5adafoQEhOmn6xi8NrWD906HhOHS0n1gwC4ho65yZbqf13gA+ATsavM5oX4
 3rwJgY4xeQJ1YN7Q89QgdO9ecrDod+Yq2HVkfj/LeMuXJKpCMkbYda6Qyg0t4jpF
 XLKLfXF/3BhzOvnLwh6pwQxV9F4HwBS3lw8fzw7eNua3pRoyQWGmzuFRoGA5HW5k
 L3KheVPNNYKpyeAKmoDF
 =FjkZ
 -----END PGP SIGNATURE-----

Merge tag 'configfs-for-linus' of git://git.infradead.org/users/hch/configfs

Pull configfs updates from Christoph Hellwig:
 "I'm assisting Joel as co-maintainer and patch monkey now, and you will
  see pull reuquests from me for a while.

  Besides the MAINTAINERS update there is just a single change, which
  adds support for binary attributes to configfs, which are very similar
  to the sysfs binary attributes.  Thanks to Pantelis Antoniou!

  You will see another actually bigger set of configfs changes in the
  SCSI target pull from Nic - those were merged before this new tree
  even existed"

* tag 'configfs-for-linus' of git://git.infradead.org/users/hch/configfs:
  configfs: add myself as co-maintainer, updated git tree
  configfs: implement binary attributes
2016-01-12 18:15:34 -08:00
Pantelis Antoniou 03607ace80 configfs: implement binary attributes
ConfigFS lacked binary attributes up until now. This patch
introduces support for binary attributes in a somewhat similar
manner of sysfs binary attributes albeit with changes that
fit the configfs usage model.

Problems that configfs binary attributes fix are everything that
requires a binary blob as part of the configuration of a resource,
such as bitstream loading for FPGAs, DTBs for dynamically created
devices etc.

Look at Documentation/filesystems/configfs/configfs.txt for internals
and howto use them.

This patch is against linux-next as of today that contains
Christoph's configfs rework.

Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
[hch: folded a fix from Geert Uytterhoeven <geert+renesas@glider.be>]
[hch: a few tiny updates based on review feedback]
Signed-off-by: Christoph Hellwig <hch@lst.de>
2016-01-04 12:31:46 +01:00
Al Viro fceef393a5 switch ->get_link() to delayed_call, kill ->put_link()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-30 13:01:03 -05:00
Chao Yu 343f40f0a7 f2fs: introduce new option for controlling data flush
Add a new option 'data_flush' to enable data flush functionality.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-12-16 09:25:48 -08:00
Masanari Iida 7acccdbc4d Doc: treewide: Fix grammar "a" to "an"
This patch fix some grammar mistake.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2015-12-10 11:36:40 -07:00
Al Viro 6b2553918d replace ->follow_link() with new method that could stay in RCU mode
new method: ->get_link(); replacement of ->follow_link().  The differences
are:
	* inode and dentry are passed separately
	* might be called both in RCU and non-RCU mode;
the former is indicated by passing it a NULL dentry.
	* when called that way it isn't allowed to block
and should return ERR_PTR(-ECHILD) if it needs to be called
in non-RCU mode.

It's a flagday change - the old method is gone, all in-tree instances
converted.  Conversion isn't hard; said that, so far very few instances
do not immediately bail out when called in RCU mode.  That'll change
in the next commits.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-08 22:41:54 -05:00
Al Viro 21fc61c73c don't put symlink bodies in pagecache into highmem
kmap() in page_follow_link_light() needed to go - allowing to hold
an arbitrary number of kmaps for long is a great way to deadlocking
the system.

new helper (inode_nohighmem(inode)) needs to be used for pagecache
symlinks inodes; done for all in-tree cases.  page_follow_link_light()
instrumented to yell about anything missed.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-08 22:41:36 -05:00
Masanari Iida 4bb9998d38 Doc: f2fs: Fix typos in Documentation/filesystems/f2fs.txt
This patch fix some typos in Documentation/filesystems/f2fs.txt

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-12-04 11:52:33 -08:00
Mike Marshall a52079dad4 Linux 4.4-rc1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWSSq2AAoJEHm+PkMAQRiGIpQH/2Ro7UPwLARfZyd7GKpCRd/D
 y1aJEk9GaEywkGKPd/Hf3XU+/9vxqyTWhkvpKHkGAFqzuYvw2rSE1JZnGfA+dyt/
 1R/WSKM8f3VL2xmNmRJCXH+zn3W2u8Cvk0oj8VN0hseXByQfCT4HfoHiFrNwuo/9
 63f93Dqsai1G3CBy+/HSPEv5CGR+7yeTJFWS0hQK4kc3prmitI93C5PZkyePyAId
 wrgKo320rIcNL/lEAbMFEBimAqluIjGvdrR9FK6w3F+OwSlTV1HdSwTzNjxwdpP5
 aRKNHZMdsGlShsn12Wsp+mjVFpi9OrT8Askq+/GTDsG+9QMia/gUKC21K9b6UxM=
 =byS9
 -----END PGP SIGNATURE-----

Orangefs: Merge tag 'v4.4-rc1' into for-next

Linux 4.4-rc1
2015-11-16 10:58:57 -05:00
Linus Torvalds 9aa3d651a9 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "This series contains HCH's changes to absorb configfs attribute
  ->show() + ->store() function pointer usage from it's original
  tree-wide consumers, into common configfs code.

  It includes usb-gadget, target w/ drivers, netconsole and ocfs2
  changes to realize the improved simplicity, that now renders the
  original include/target/configfs_macros.h CPP magic for fabric drivers
  and others, unnecessary and obsolete.

  And with common code in place, new configfs attributes can be added
  easier than ever before.

  Note, there are further improvements in-flight from other folks for
  v4.5 code in configfs land, plus number of target fixes for post -rc1
  code"

In the meantime, a new user of the now-removed old configfs API came in
through the char/misc tree in commit 7bd1d4093c ("stm class: Introduce
an abstraction for System Trace Module devices").

This merge resolution comes from Alexander Shishkin, who updated his stm
class tracing abstraction to account for the removal of the old
show_attribute and store_attribute methods in commit 517982229f
("configfs: remove old API") from this pull.  As Alexander says about
that patch:

 "There's no need to keep an extra wrapper structure per item and the
  awkward show_attribute/store_attribute item ops are no longer needed.

  This patch converts policy code to the new api, all the while making
  the code quite a bit smaller and easier on the eyes.

  Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>"

That patch was folded into the merge so that the tree should be fully
bisectable.

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (23 commits)
  configfs: remove old API
  ocfs2/cluster: use per-attribute show and store methods
  ocfs2/cluster: move locking into attribute store methods
  netconsole: use per-attribute show and store methods
  target: use per-attribute show and store methods
  spear13xx_pcie_gadget: use per-attribute show and store methods
  dlm: use per-attribute show and store methods
  usb-gadget/f_serial: use per-attribute show and store methods
  usb-gadget/f_phonet: use per-attribute show and store methods
  usb-gadget/f_obex: use per-attribute show and store methods
  usb-gadget/f_uac2: use per-attribute show and store methods
  usb-gadget/f_uac1: use per-attribute show and store methods
  usb-gadget/f_mass_storage: use per-attribute show and store methods
  usb-gadget/f_sourcesink: use per-attribute show and store methods
  usb-gadget/f_printer: use per-attribute show and store methods
  usb-gadget/f_midi: use per-attribute show and store methods
  usb-gadget/f_loopback: use per-attribute show and store methods
  usb-gadget/ether: use per-attribute show and store methods
  usb-gadget/f_acm: use per-attribute show and store methods
  usb-gadget/f_hid: use per-attribute show and store methods
  ...
2015-11-13 20:04:17 -08:00
Linus Torvalds 4aeabc6b5c A few more documentation patches that wandered in and have no reason to
wait; these include some improvements to the suggestions for email clients
 and patch submission.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWRge3AAoJEI3ONVYwIuV6BboP/30p1d0kGC9KWH96aVZst1uu
 4VaUcOf+zKp8wQwhyHQIgmGlgD8u6fzfa8i2YIiVRzJD18Tm67KjHsbSiT4qgI84
 xizcmnuRogTthtWTmGITKfgQ5OL3Z0IX/5JgIMoIdmAKSjg/3kSR2b5FvN+tj6Qh
 9SQAWYIW2cvDWp9PV8gWOdA2j6EKDQ6BdwRE749LmS3MXRN9bM/KRezCyIrmCvOF
 FChTzxVjKd2TNYlO9nDyzn70WhyoV/e7jn+9AOKC4XHOCrFI0KB36mE8Wv/VmUmF
 +6YPLWNJrTb7J07p9fkYb0FeMzEVcZIVqtMgvtmWTrX4qat9VJI38tOcgDQKHOhj
 ETf+8DDM0t4pNUw+3KDdCdu+AIZcEuuRQYc7AUC/URHZGM+LsI0hERRcNS5Z/KRv
 lKyq3Y+A+/Z7b6Ia87lJavgubEdY3pagCOZXWQWvb8CwaTnOhuf4qfBK6pkqJBuN
 2DfNSQrsi2t1cqKfrnGr58kkMslOKzZLXXSjgvB8IhHmqedNha9JODykq2ldnMdx
 Lch1fIe8Exh+p0pY3nKUnmB69dMZnuNVnW92C4JbNr+zgU1jxQeLVer3bum3NzZw
 cEASmIPb1nJnl5aJkapc/rz+tSJ9GIIgaTyarNO+cVWQ712gYU5Ot4nNj5ayKW5C
 RgQOqBL+68GF1tMUB61b
 =6qJ0
 -----END PGP SIGNATURE-----

Merge tag '4.4-additional' of git://git.lwn.net/linux

Pull more documentation updates from Jon Corbet:
 "A few more documentation patches that wandered in and have no reason
  to wait; these include some improvements to the suggestions for email
  clients and patch submission"

* tag '4.4-additional' of git://git.lwn.net/linux:
  Documentation: Add minimal Mutt config for using Gmail
  Documentation: Add note on sending files directly with Mutt
  Documentation: dontdiff: remove media from dontdiff
  Documentation/SubmittingPatches: discuss In-Reply-To
  Remove email address from Documentation/filesystems/overlayfs.txt
  can-doc: Add missing semicolon to example
2015-11-13 09:19:05 -08:00
NeilBrown a907c90765 Remove email address from Documentation/filesystems/overlayfs.txt
I'm getting a surprising large number of questions about overlayfs sent
to me personally, rather than to a relevant mailing list.

So remove my email address from the documentation, and add a note
about looking in the MAINTAINERS file.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2015-11-11 10:04:53 -07:00
Linus Torvalds 264015f8a8 libnvdimm for 4.4:
1/ Add support for the ACPI 6.0 NFIT hot add mechanism to process
    updates of the NFIT at runtime.
 
 2/ Teach the coredump implementation how to filter out DAX mappings.
 
 3/ Introduce NUMA hints for allocations made by the pmem driver, and as
    a side effect all devm allocations now hint their NUMA node by
    default.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWQX2sAAoJEB7SkWpmfYgCWsEQAK7w/xM9zClVY/DDlFJxFtYq
 DZJ4faPj+E3FMTiJIEDzjtRgQvOFE+wmJtntYsCqKH/QZmpnyk9jeT/CbJzEEL2k
 WsAk+qHGLcVUlSb36blwN1RFzYqC+IDYThewJqUvxDbOwL1AbiibbX7gplzZHLhW
 +rj3ScVlSNOPRDgGGpkAeLNNsttuKvsGo7nB/JZopm0tV6g14rSK09wQbVhv6S6T
 Lu7VGYqnJlkteL9YlzRiROf9hW2ZFCMGJz1YZydPTy3aX3hGTBX4w2qvmsPwBIKP
 kW/gCNisVJGk1cZCk4joSJ8i/b3x3fE0zdZ5waivJ5jDvYbUUfyk0KtJkfw207Rl
 14yWitUC6aeVuCeOqXHgsjRi+1QVN9Pg7i49xgGiUN1igQiUYRTgQPWZxDv6Zo/s
 USrLFQBaRd+hJw+dl7A47lJ3mUF96tPCoQb4LCQ7DVsg5U4J2TvqXLH9Gek/CCZ4
 QsMkZDTQlZw4+JEDlzBgg/L7xVty8DadplTADMdjaRhFU3y8zKNJ85Ileokt7KVt
 IsBT4+S5HeZLvinZY95932DwAmFp1DtsyENd1BUXL06ddyvlQrFJ6NQaXji4fuDc
 EVQmMoTAqDujZFupMAux9vkUBDFj/hmaVD5F7j3+MWP87OCritw/IZn+2LgTaKoX
 EmttaYrDr2jJwIaGyw+H
 =a2/L
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
 "Outside of the new ACPI-NFIT hot-add support this pull request is more
  notable for what it does not contain, than what it does.  There were a
  handful of development topics this cycle, dax get_user_pages, dax
  fsync, and raw block dax, that need more more iteration and will wait
  for 4.5.

  The patches to make devm and the pmem driver NUMA aware have been in
  -next for several weeks.  The hot-add support has not, but is
  contained to the NFIT driver and is passing unit tests.  The coredump
  support is straightforward and was looked over by Jeff.  All of it has
  received a 0day build success notification across 107 configs.

  Summary:

   - Add support for the ACPI 6.0 NFIT hot add mechanism to process
     updates of the NFIT at runtime.

   - Teach the coredump implementation how to filter out DAX mappings.

   - Introduce NUMA hints for allocations made by the pmem driver, and
     as a side effect all devm allocations now hint their NUMA node by
     default"

* tag 'libnvdimm-for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  coredump: add DAX filtering for FDPIC ELF coredumps
  coredump: add DAX filtering for ELF coredumps
  acpi: nfit: Add support for hot-add
  nfit: in acpi_nfit_init, break on a 0-length table
  pmem, memremap: convert to numa aware allocations
  devm_memremap_pages: use numa_mem_id
  devm: make allocations numa aware by default
  devm_memremap: convert to return ERR_PTR
  devm_memunmap: use devres_release()
  pmem: kill memremap_pmem()
  x86, mm: quiet arch_add_memory()
2015-11-10 12:07:22 -08:00
Linus Torvalds 9d74288ca7 GFS2: merge window
Here is a list of patches we've accumulated for GFS2 for the current upstream
 merge window. There are only six patches this time:
 
 1. A cleanup patch from Andreas to remove the gl_spin #define in favor
    of its value for the sake of clarity.
 2. A fix from Andy Price to mark the inode dirty during fallocate.
 3. A fix from Andy Price to set s_mode on mount failures to prevent
    a stack trace.
 4. A patch from me to prevent a kernel BUG() in trans_add_meta/trans_add_data
    due to uninitialized storage.
 5. A patch from me to protecting our freeing of the in-core directory
    hash table to prevent double-free.
 6. A fix for a page/block rounding problem that resulted in a metadata
    coherency problem when the block size != page size.
 
 I've got a lot more patches in various stages of review and testing,
 but I'm afraid they'll have to wait until the next merge window. So
 next time we're likely to have a lot more.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWQMSyAAoJENeLYdPf93o7k+EH/inFFqkYxLCyXZngihTHdZvS
 tYAwYxPJw6UgSrZ1dY6iwcmhy6YgT7a98RJdPA3Kj0SvJxQVBiJ5uc0VKpK0bj72
 l7pVPkMEWCHs8u8RAIGfnik8y6IxOP35+EN0U/3ZLMG1Gc+Tmq9M8KLlnhfX980q
 oaniaDJAUaSSW8RxD2AKabxjoJ0DKnE6MtDHsL/JWhp1j5co5BbbwOzBmBa2mLCI
 RQ8YEvjqjtgm91g33pkxXJVMjAkFqLjRSVfomd5MSQWRUb+eGIpd3LFThHzVfm55
 2f4j2kd2V4i7rTrh8Q3RdoYRoFgpXyxXQ3R2UYL59b7B2DvGTyAKyDxfU237ZXQ=
 =yanT
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 updates from Bob Peterson:
 "Here is a list of patches we've accumulated for GFS2 for the current
  upstream merge window.  There are only six patches this time:

   1. A cleanup patch from Andreas to remove the gl_spin #define in favor
      of its value for the sake of clarity.
   2. A fix from Andy Price to mark the inode dirty during fallocate.
   3. A fix from Andy Price to set s_mode on mount failures to prevent a
      stack trace.
   4  A patch from me to prevent a kernel BUG() in trans_add_meta/trans_add_data
      due to uninitialized storage.
   5. A patch from me to protecting our freeing of the in-core directory
      hash table to prevent double-free.
   6. A fix for a page/block rounding problem that resulted in a metadata
      coherency problem when the block size != page size"

  I've got a lot more patches in various stages of review and testing,
  but I'm afraid they'll have to wait until the next merge window.  So
  next time we're likely to have a lot more"

* tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  GFS2: Fix rgrp end rounding problem for bsize < page size
  GFS2: Protect freeing directory hash table with i_lock spin_lock
  gfs2: Remove gl_spin define
  gfs2: Add missing else in trans_add_meta/data
  GFS2: Set s_mode before parsing mount options
  GFS2: fallocate: do not rely on file_update_time to mark the inode dirty
2015-11-09 18:01:23 -08:00
Ross Zwisler 5037835c1f coredump: add DAX filtering for ELF coredumps
Add two new flags to the existing coredump mechanism for ELF files to
allow us to explicitly filter DAX mappings.  This is desirable because
DAX mappings, like hugetlb mappings, have the potential to be very
large.

Update the coredump_filter documentation in
Documentation/filesystems/proc.txt so that it addresses the new DAX
coredump flags.  Also update the documented default value of
coredump_filter to be consistent with the core(5) man page.  The
documentation being updated talks about bit 4, Dump ELF headers, which
is enabled if CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is turned on in the
kernel config.  This kernel config option defaults to "y" if both ELF
binaries and coredump are enabled.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-11-09 13:29:54 -05:00
Linus Torvalds 2e3078af2c Merge branch 'akpm' (patches from Andrew)
Merge patch-bomb from Andrew Morton:

 - inotify tweaks

 - some ocfs2 updates (many more are awaiting review)

 - various misc bits

 - kernel/watchdog.c updates

 - Some of mm.  I have a huge number of MM patches this time and quite a
   lot of it is quite difficult and much will be held over to next time.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (162 commits)
  selftests: vm: add tests for lock on fault
  mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage
  mm: introduce VM_LOCKONFAULT
  mm: mlock: add new mlock system call
  mm: mlock: refactor mlock, munlock, and munlockall code
  kasan: always taint kernel on report
  mm, slub, kasan: enable user tracking by default with KASAN=y
  kasan: use IS_ALIGNED in memory_is_poisoned_8()
  kasan: Fix a type conversion error
  lib: test_kasan: add some testcases
  kasan: update reference to kasan prototype repo
  kasan: move KASAN_SANITIZE in arch/x86/boot/Makefile
  kasan: various fixes in documentation
  kasan: update log messages
  kasan: accurately determine the type of the bad access
  kasan: update reported bug types for kernel memory accesses
  kasan: update reported bug types for not user nor kernel memory accesses
  mm/kasan: prevent deadlock in kasan reporting
  mm/kasan: don't use kasan shadow pointer in generic functions
  mm/kasan: MODULE_VADDR is not available on all archs
  ...
2015-11-05 23:10:54 -08:00
Hugh Dickins a5be35632c Documentation/filesystems/proc.txt: a little tidying
There's an odd line about "Locked" at the head of the description of
/proc/meminfo: it seems to have strayed from /proc/PID/smaps, so lead it
back there.  Move "Swap" and "SwapPss" descriptions down above it, to
match the order in the file (though "PageSize"s still undescribed).

The example of "Locked: 374 kB" (the same as Pss, neither Rss nor Size) is
so unlikely as to be misleading: just make it 0, this is /bin/bash text;
which would be "dw" (disabled write) not "de" (do not expand).

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Hugh Dickins 7a14239a8f mm Documentation: undoc non-linear vmas
While updating some mm Documentation, I came across a few straggling
references to the non-linear vmas which were happily removed in v4.0.
Delete them.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Naoya Horiguchi 5d317b2b65 mm: hugetlb: proc: add HugetlbPages field to /proc/PID/status
Currently there's no easy way to get per-process usage of hugetlb pages,
which is inconvenient because userspace applications which use hugetlb
typically want to control their processes on the basis of how much memory
(including hugetlb) they use.  So this patch simply provides easy access
to the info via /proc/PID/status.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Joern Engel <joern@logfs.org>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Naoya Horiguchi 25ee01a2fc mm: hugetlb: proc: add hugetlb-related fields to /proc/PID/smaps
Currently /proc/PID/smaps provides no usage info for vma(VM_HUGETLB),
which is inconvenient when we want to know per-task or per-vma base
hugetlb usage.  To solve this, this patch adds new fields for hugetlb
usage like below:

  Size:              20480 kB
  Rss:                   0 kB
  Pss:                   0 kB
  Shared_Clean:          0 kB
  Shared_Dirty:          0 kB
  Private_Clean:         0 kB
  Private_Dirty:         0 kB
  Referenced:            0 kB
  Anonymous:             0 kB
  AnonHugePages:         0 kB
  Shared_Hugetlb:    18432 kB
  Private_Hugetlb:    2048 kB
  Swap:                  0 kB
  KernelPageSize:     2048 kB
  MMUPageSize:        2048 kB
  Locked:                0 kB
  VmFlags: rd wr mr mw me de ht

[hughd@google.com: fix Private_Hugetlb alignment ]
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Joern Engel <joern@logfs.org>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Linus Torvalds 5ebe0ee802 There is a nice new document from Neil on how pathname lookups work and
some new CAN driver documentation.  Beyond that, we have kernel-doc fixes,
 a bit more work to support reproducible builds, and the usual collection of
 small fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWO6HiAAoJEI3ONVYwIuV6ihwQAK0KC72h0706bdwDJ1p1/aJU
 QLuPeiKYWgGAXq2zgOyw3Povj4bkMwkiq1IGHLyK0Id4tg3ngxOXjimk4YKrqarI
 BD5HdpOm7IyQEe66ZU9b1RFDVst+bg3yp6ZIZsH5vQxl/KnyJ6AyaaDk8TPYId8S
 1+CykJzxyi7GyT/jlLpHbKtBKrraoVke+cNPMAvOf0NjSyO7Ix5B+qH50sttG6Eu
 9qcQ8hlKXOdZRTiGW6P+jeZNA+e5+CRpnG9VHBquHy4lI85kQThhWq41UMH690PP
 eRbLipeUybb0FwW2KwuMjGKEMDkMvrGJh0TzSXX9lGHd+5/41v7zcyKh8vJcpLjh
 bNQ2WOAKUBd2d15EP1MNoKXDLGJXusJczLwOjigWiSCQvgouAWwMrpWEw+Obv8Yl
 rdoH1oQqDFfDnk6mnKrSaqLWGNuLxDtkEl/1P0jsGSK6lM3FDkOgTuNPYXTJJgxN
 rXuGmPhyUlS2srERUeQJw2rISN0WRBvcKJGkMX6IpvrXHkItbelqK+yY1DeKPmcm
 qgbIx9ZWNqtltFpG22VVByqAVwucO5Nu8cAIQ2ysJsTnKOvQCQmhu5UKTjBCkEJM
 VpeMm32BfNiJFLuLTQGWBZ8bkRl2shQyXhOaR3uyqG4T+rpPD3qJi6dtFRpsAzOB
 q1nZuJCpOaxJFzjSKvpJ
 =emZ7
 -----END PGP SIGNATURE-----

Merge tag 'docs-for-linus' of git://git.lwn.net/linux

Pull documentation update from Jon Corbet:
 "There is a nice new document from Neil on how pathname lookups work
  and some new CAN driver documentation.  Beyond that, we have
  kernel-doc fixes, a bit more work to support reproducible builds, and
  the usual collection of small fixes"

* tag 'docs-for-linus' of git://git.lwn.net/linux: (34 commits)
  Documentation: add new description of path-name lookup.
  Documentation/vm/slub.txt: document slabinfo-gnuplot.sh
  Doc: ABI/stable: Fix typo in ABI/stable
  doc: Clarify that nmi_watchdog param is for hardlockups
  Typo correction for description in gpio document.
  DocBook: Fix kernel-doc to be case-insensitive for private:
  kernel-docs.txt: update kernelnewbies reference
  Doc:kvm: Fix typo in Doc/virtual/kvm
  Documentation/Changes: Add bc in "Current Minimal Requirements" section
  Documentation/email-clients.txt: remove trailing whitespace
  DocBook: Use a fixed encoding for output
  MAINTAINERS: The docs tree has moved
  Docs/kernel-parameters: Add earlycon devicetree usage
  SubmittingPatches: make Subject examples match the de facto standard
  Documentation: gpio: mention that <function>-gpio has been deprecated
  Documentation: cgroups: just fix a few typos
  Documentation: Update kselftest.txt
  Documentation: DMA API: Be more explicit that nents is always the same
  Documentation: Update the default value of crashkernel low
  zram: update documentation
  ...
2015-11-05 15:59:24 -08:00
Linus Torvalds 0fcb9d21b4 Merge tag 'for-f2fs-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
 "Most part of the patches include enhancing the stability and
  performance of in-memory extent caches feature.

  In addition, it introduces several new features and configurable
  points:
   - F2FS_GOING_DOWN_METAFLUSH ioctl to test power failures
   - F2FS_IOC_WRITE_CHECKPOINT ioctl to trigger checkpoint by users
   - background_gc=sync mount option to do gc synchronously
   - periodic checkpoints
   - sysfs entry to control readahead blocks for free nids

  And the following bug fixes have been merged.
   - fix SSA corruption by collapse/insert_range
   - correct a couple of gc behaviors
   - fix the results of f2fs_map_blocks
   - fix error case handling of volatile/atomic writes"

* tag 'for-f2fs-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (54 commits)
  f2fs: fix to skip shrinking extent nodes
  f2fs: fix error path of ->symlink
  f2fs: fix to clear GCed flag for atomic written page
  f2fs: don't need to submit bio on error case
  f2fs: fix leakage of inmemory atomic pages
  f2fs: refactor __find_rev_next_{zero}_bit
  f2fs: support fiemap for inline_data
  f2fs: flush dirty data for bmap
  f2fs: relocate the tracepoint for background_gc
  f2fs crypto: fix racing of accessing encrypted page among
  f2fs: export ra_nid_pages to sysfs
  f2fs: readahead for free nids building
  f2fs: support lower priority asynchronous readahead in ra_meta_pages
  f2fs: don't tag REQ_META for temporary non-meta pages
  f2fs: add a tracepoint for f2fs_read_data_pages
  f2fs: set GFP_NOFS for grab_cache_page
  f2fs: fix SSA updates resulting in corruption
  Revert "f2fs: do not skip dentry block writes"
  f2fs: add F2FS_GOING_DOWN_METAFLUSH to test power-failure
  f2fs: merge meta writes as many possible
  ...
2015-11-05 11:22:07 -08:00
Linus Torvalds e880e87488 driver core update for 4.4-rc1
Here's the "big" driver core updates for 4.4-rc1.  Primarily a bunch of
 debugfs updates, with a smattering of minor driver core fixes and
 updates as well.
 
 All have been in linux-next for a long time.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlY6ePQACgkQMUfUDdst+ymNTgCgpP0CZw57GpwF/Hp2L/lMkVeo
 Kx8AoKhEi4iqD5fdCQS9qTfomB+2/M6g
 =g7ZO
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-4.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here's the "big" driver core updates for 4.4-rc1.  Primarily a bunch
  of debugfs updates, with a smattering of minor driver core fixes and
  updates as well.

  All have been in linux-next for a long time"

* tag 'driver-core-4.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  debugfs: Add debugfs_create_ulong()
  of: to support binding numa node to specified device in devicetree
  debugfs: Add read-only/write-only bool file ops
  debugfs: Add read-only/write-only size_t file ops
  debugfs: Add read-only/write-only x64 file ops
  debugfs: Consolidate file mode checks in debugfs_create_*()
  Revert "mm: Check if section present during memory block (un)registering"
  driver-core: platform: Provide helpers for multi-driver modules
  mm: Check if section present during memory block (un)registering
  devres: fix a for loop bounds check
  CMA: fix CONFIG_CMA_SIZE_MBYTES overflow in 64bit
  base/platform: assert that dev_pm_domain callbacks are called unconditionally
  sysfs: correctly handle short reads on PREALLOC attrs.
  base: soc: siplify ida usage
  kobject: move EXPORT_SYMBOL() macros next to corresponding definitions
  kobject: explain what kobject's sd field is
  debugfs: document that debugfs_remove*() accepts NULL and error values
  debugfs: Pass bool pointer to debugfs_create_bool()
  ACPI / EC: Fix broken 64bit big-endian users of 'global_lock'
2015-11-04 21:50:37 -08:00
Linus Torvalds b0f85fa11a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

Changes of note:

 1) Allow to schedule ICMP packets in IPVS, from Alex Gartrell.

 2) Provide FIB table ID in ipv4 route dumps just as ipv6 does, from
    David Ahern.

 3) Allow the user to ask for the statistics to be filtered out of
    ipv4/ipv6 address netlink dumps.  From Sowmini Varadhan.

 4) More work to pass the network namespace context around deep into
    various packet path APIs, starting with the netfilter hooks.  From
    Eric W Biederman.

 5) Add layer 2 TX/RX checksum offloading to qeth driver, from Thomas
    Richter.

 6) Use usec resolution for SYN/ACK RTTs in TCP, from Yuchung Cheng.

 7) Support Very High Throughput in wireless MESH code, from Bob
    Copeland.

 8) Allow setting the ageing_time in switchdev/rocker.  From Scott
    Feldman.

 9) Properly autoload L2TP type modules, from Stephen Hemminger.

10) Fix and enable offload features by default in 8139cp driver, from
    David Woodhouse.

11) Support both ipv4 and ipv6 sockets in a single vxlan device, from
    Jiri Benc.

12) Fix CWND limiting of thin streams in TCP, from Bendik Rønning
    Opstad.

13) Fix IPSEC flowcache overflows on large systems, from Steffen
    Klassert.

14) Convert bridging to track VLANs using rhashtable entries rather than
    a bitmap.  From Nikolay Aleksandrov.

15) Make TCP listener handling completely lockless, this is a major
    accomplishment.  Incoming request sockets now live in the
    established hash table just like any other socket too.

    From Eric Dumazet.

15) Provide more bridging attributes to netlink, from Nikolay
    Aleksandrov.

16) Use hash based algorithm for ipv4 multipath routing, this was very
    long overdue.  From Peter Nørlund.

17) Several y2038 cures, mostly avoiding timespec.  From Arnd Bergmann.

18) Allow non-root execution of EBPF programs, from Alexei Starovoitov.

19) Support SO_INCOMING_CPU as setsockopt, from Eric Dumazet.  This
    influences the port binding selection logic used by SO_REUSEPORT.

20) Add ipv6 support to VRF, from David Ahern.

21) Add support for Mellanox Spectrum switch ASIC, from Jiri Pirko.

22) Add rtl8xxxu Realtek wireless driver, from Jes Sorensen.

23) Implement RACK loss recovery in TCP, from Yuchung Cheng.

24) Support multipath routes in MPLS, from Roopa Prabhu.

25) Fix POLLOUT notification for listening sockets in AF_UNIX, from Eric
    Dumazet.

26) Add new QED Qlogic river, from Yuval Mintz, Manish Chopra, and
    Sudarsana Kalluru.

27) Don't fetch timestamps on AF_UNIX sockets, from Hannes Frederic
    Sowa.

28) Support ipv6 geneve tunnels, from John W Linville.

29) Add flood control support to switchdev layer, from Ido Schimmel.

30) Fix CHECKSUM_PARTIAL handling of potentially fragmented frames, from
    Hannes Frederic Sowa.

31) Support persistent maps and progs in bpf, from Daniel Borkmann.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1790 commits)
  sh_eth: use DMA barriers
  switchdev: respect SKIP_EOPNOTSUPP flag in case there is no recursion
  net: sched: kill dead code in sch_choke.c
  irda: Delete an unnecessary check before the function call "irlmp_unregister_service"
  net: dsa: mv88e6xxx: include DSA ports in VLANs
  net: dsa: mv88e6xxx: disable SA learning for DSA and CPU ports
  net/core: fix for_each_netdev_feature
  vlan: Invoke driver vlan hooks only if device is present
  arcnet/com20020: add LEDS_CLASS dependency
  bpf, verifier: annotate verbose printer with __printf
  dp83640: Only wait for timestamps for packets with timestamping enabled.
  ptp: Change ptp_class to a proper bitmask
  dp83640: Prune rx timestamp list before reading from it
  dp83640: Delay scheduled work.
  dp83640: Include hash in timestamp/packet matching
  ipv6: fix tunnel error handling
  net/mlx5e: Fix LSO vlan insertion
  net/mlx5e: Re-eanble client vlan TX acceleration
  net/mlx5e: Return error in case mlx5e_set_features() fails
  net/mlx5e: Don't allow more than max supported channels
  ...
2015-11-04 09:41:05 -08:00
Neil Brown 3ce96239d4 Documentation: add new description of path-name lookup.
This document is based on three recent lwn.net articles.
Some of the introductory material and linkage between articles
has been removed, and some time-based descriptions have been
revised.

Also all links to code have been removed as the code is very close by.

Contains corrections and improvements from Randy Dunlap <rdunlap@infradead.org>.

Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2015-11-02 18:18:25 -07:00
Andreas Gruenbacher f3dd164912 gfs2: Remove gl_spin define
Commit e66cf161 replaced the gl_spin spinlock in struct gfs2_glock with a
gl_lockref lockref and defined gl_spin as gl_lockref.lock (the spinlock in
gl_lockref).  Remove that define to make the references to gl_lockref.lock more
obvious.

Signed-off-by: Andreas Gruenbacher <andreas.gruenbacher@gmail.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2015-10-29 12:57:48 -05:00
Li RongQing 26fb342c73 ipconfig: send Client-identifier in DHCP requests
A dhcp server may provide parameters to a client from a pool of IP
addresses and using a shared rootfs, or provide a specific set of
parameters for a specific client, usually using the MAC address to
identify each client individually. The dhcp protocol also specifies
a client-id field which can be used to determine the correct
parameters to supply when no MAC address is available. There is
currently no way to tell the kernel to supply a specific client-id,
only the userspace dhcp clients support this feature, but this can
not be used when the network is needed before userspace is available
such as when the root filesystem is on NFS.

This patch is to be able to do something like "ip=dhcp,client_id_type,
client_id_value", as a kernel parameter to enable the kernel to
identify itself to the server.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-18 19:23:52 -07:00
Christoph Hellwig 517982229f configfs: remove old API
Remove the old show_attribute and store_attribute methods and update
the documentation.  Also replace the two C samples with a single new
one in the proper samples directory where people expect to find it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-10-13 22:17:57 -07:00
Jaegeuk Kim 6aefd93b01 f2fs: introduce background_gc=sync mount option
This patch introduce background_gc=sync enabling synchronous cleaning in
background.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-10-09 16:20:57 -07:00
Viresh Kumar 621a5f7ad9 debugfs: Pass bool pointer to debugfs_create_bool()
Its a bit odd that debugfs_create_bool() takes 'u32 *' as an argument,
when all it needs is a boolean pointer.

It would be better to update this API to make it accept 'bool *'
instead, as that will make it more consistent and often more convenient.
Over that bool takes just a byte.

That required updates to all user sites as well, in the same commit
updating the API. regmap core was also using
debugfs_{read|write}_file_bool(), directly and variable types were
updated for that to be bool as well.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-04 11:36:07 +01:00
Mike Marshall 74a552a133 Orangefs: kernel client part 6
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2015-10-03 11:39:59 -04:00
Ingo Molnar b2f73922d1 fs/proc, core/debug: Don't expose absolute kernel addresses via wchan
So the /proc/PID/stat 'wchan' field (the 30th field, which contains
the absolute kernel address of the kernel function a task is blocked in)
leaks absolute kernel addresses to unprivileged user-space:

        seq_put_decimal_ull(m, ' ', wchan);

The absolute address might also leak via /proc/PID/wchan as well, if
KALLSYMS is turned off or if the symbol lookup fails for some reason:

static int proc_pid_wchan(struct seq_file *m, struct pid_namespace *ns,
                          struct pid *pid, struct task_struct *task)
{
        unsigned long wchan;
        char symname[KSYM_NAME_LEN];

        wchan = get_wchan(task);

        if (lookup_symbol_name(wchan, symname) < 0) {
                if (!ptrace_may_access(task, PTRACE_MODE_READ))
                        return 0;
                seq_printf(m, "%lu", wchan);
        } else {
                seq_printf(m, "%s", symname);
        }

        return 0;
}

This isn't ideal, because for example it trivially leaks the KASLR offset
to any local attacker:

  fomalhaut:~> printf "%016lx\n" $(cat /proc/$$/stat | cut -d' ' -f35)
  ffffffff8123b380

Most real-life uses of wchan are symbolic:

  ps -eo pid:10,tid:10,wchan:30,comm

and procps uses /proc/PID/wchan, not the absolute address in /proc/PID/stat:

  triton:~/tip> strace -f ps -eo pid:10,tid:10,wchan:30,comm 2>&1 | grep wchan | tail -1
  open("/proc/30833/wchan", O_RDONLY)     = 6

There's one compatibility quirk here: procps relies on whether the
absolute value is non-zero - and we can provide that functionality
by outputing "0" or "1" depending on whether the task is blocked
(whether there's a wchan address).

These days there appears to be very little legitimate reason
user-space would be interested in  the absolute address. The
absolute address is mostly historic: from the days when we
didn't have kallsyms and user-space procps had to do the
decoding itself via the System.map.

So this patch sets all numeric output to "0" or "1" and keeps only
symbolic output, in /proc/PID/wchan.

( The absolute sleep address can generally still be profiled via
  perf, by tasks with sufficient privileges. )

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: kasan-dev <kasan-dev@googlegroups.com>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20150930135917.GA3285@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-01 12:55:34 +02:00
Ulf Magnusson 17666497fe sysfs.txt: mention that store method buffers are null-terminated
Without knowing this, the use of sysfs_streq() becomes puzzling.

The termination happens in kernfs_fop_write().

Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com>
[jc: moved the new text to a different paragraph]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2015-09-13 14:38:51 -06:00
Ulf Magnusson 9ba41327d8 sysfs-tagging.txt: fix pre-kernfs references
- sysfs_dirent is now kernfs_node - see commit 324a56e16e ("kernfs:
   s/sysfs_dirent/kernfs_node/ and rename its friends accordingly")

 - sysfs_super_info is now kernfs_super_info - see commit c525aaddc3
   ("kernfs: s/sysfs/kernfs/ in various data structures")

 - the 's_' prefix was dropped from various fields - see
   commit adc5e8b58f ("kernfs: drop s_ prefix from kernfs_node members")

Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2015-09-13 14:38:51 -06:00
Ulf Magnusson 390b421c42 sysfs.txt: fix pre-kernfs sysfs_dirent reference
sysfs_dirent went away when kernfs was extracted from sysfs. The reference
to the kobject now lives in a kernfs_node (in the 'priv' member).

See commit 324a56e16e ("kernfs: s/sysfs_dirent/kernfs_node/ and rename
its friends accordingly").

Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2015-09-13 14:38:50 -06:00
Linus Torvalds f6f7a63692 Merge branch 'akpm' (patches from Andrew)
Merge second patch-bomb from Andrew Morton:
 "Almost all of the rest of MM.  There was an unusually large amount of
  MM material this time"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (141 commits)
  zpool: remove no-op module init/exit
  mm: zbud: constify the zbud_ops
  mm: zpool: constify the zpool_ops
  mm: swap: zswap: maybe_preload & refactoring
  zram: unify error reporting
  zsmalloc: remove null check from destroy_handle_cache()
  zsmalloc: do not take class lock in zs_shrinker_count()
  zsmalloc: use class->pages_per_zspage
  zsmalloc: consider ZS_ALMOST_FULL as migrate source
  zsmalloc: partial page ordering within a fullness_list
  zsmalloc: use shrinker to trigger auto-compaction
  zsmalloc: account the number of compacted pages
  zsmalloc/zram: introduce zs_pool_stats api
  zsmalloc: cosmetic compaction code adjustments
  zsmalloc: introduce zs_can_compact() function
  zsmalloc: always keep per-class stats
  zsmalloc: drop unused variable `nr_to_migrate'
  mm/memblock.c: fix comment in __next_mem_range()
  mm/page_alloc.c: fix type information of memoryless node
  memory-hotplug: fix comments in zone_spanned_pages_in_node() and zone_spanned_pages_in_node()
  ...
2015-09-08 17:52:23 -07:00
Minchan Kim 8334b96221 mm: /proc/pid/smaps:: show proportional swap share of the mapping
We want to know per-process workingset size for smart memory management
on userland and we use swap(ex, zram) heavily to maximize memory
efficiency so workingset includes swap as well as RSS.

On such system, if there are lots of shared anonymous pages, it's really
hard to figure out exactly how many each process consumes memory(ie, rss
+ wap) if the system has lots of shared anonymous memory(e.g, android).

This patch introduces SwapPss field on /proc/<pid>/smaps so we can get
more exact workingset size per process.

Bongkyu tested it. Result is below.

1. 50M used swap
SwapTotal: 461976 kB
SwapFree: 411192 kB

$ adb shell cat /proc/*/smaps | grep "SwapPss:" | awk '{sum += $2} END {print sum}';
48236
$ adb shell cat /proc/*/smaps | grep "Swap:" | awk '{sum += $2} END {print sum}';
141184

2. 240M used swap
SwapTotal: 461976 kB
SwapFree: 216808 kB

$ adb shell cat /proc/*/smaps | grep "SwapPss:" | awk '{sum += $2} END {print sum}';
230315
$ adb shell cat /proc/*/smaps | grep "Swap:" | awk '{sum += $2} END {print sum}';
1387744

[akpm@linux-foundation.org: simplify kunmap_atomic() call]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Bongkyu Kim <bongkyu.kim@lge.com>
Tested-by: Bongkyu Kim <bongkyu.kim@lge.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Matthew Wilcox 844f35db10 dax: add huge page fault support
This is the support code for DAX-enabled filesystems to allow them to
provide huge pages in response to faults.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Linus Torvalds 12f03ee606 libnvdimm for 4.3:
1/ Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
    mechanism for adding device-driver-discovered memory regions to the
    kernel's direct map.  This facility is used by the pmem driver to
    enable pfn_to_page() operations on the page frames returned by DAX
    ('direct_access' in 'struct block_device_operations'). For now, the
    'memmap' allocation for these "device" pages comes from "System
    RAM".  Support for allocating the memmap from device memory will
    arrive in a later kernel.
 
 2/ Introduce memremap() to replace usages of ioremap_cache() and
    ioremap_wt().  memremap() drops the __iomem annotation for these
    mappings to memory that do not have i/o side effects.  The
    replacement of ioremap_cache() with memremap() is limited to the
    pmem driver to ease merging the api change in v4.3.  Completion of
    the conversion is targeted for v4.4.
 
 3/ Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
    driver, update the VFS DAX implementation and PMEM api to provide
    persistence guarantees for kernel operations on a DAX mapping.
 
 4/ Convert the ACPI NFIT 'BLK' driver to map the block apertures as
    cacheable to improve performance.
 
 5/ Miscellaneous updates and fixes to libnvdimm including support
    for issuing "address range scrub" commands, clarifying the optimal
    'sector size' of pmem devices, a clarification of the usage of the
    ACPI '_STA' (status) property for DIMM devices, and other minor
    fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV6Nx7AAoJEB7SkWpmfYgCWyYQAI5ju6Gvw27RNFtPovHcZUf5
 JGnxXejI6/AqeTQ+IulgprxtEUCrXOHjCDA5dkjr1qvsoqK1qxug+vJHOZLgeW0R
 OwDtmdW4Qrgeqm+CPoxETkorJ8wDOc8mol81kTiMgeV3UqbYeeHIiTAmwe7VzZ0C
 nNdCRDm5g8dHCjTKcvK3rvozgyoNoWeBiHkPe76EbnxDICxCB5dak7XsVKNMIVFQ
 NuYlnw6IYN7+rMHgpgpRux38NtIW8VlYPWTmHExejc2mlioWMNBG/bmtwLyJ6M3e
 zliz4/cnonTMUaizZaVozyinTa65m7wcnpjK+vlyGV2deDZPJpDRvSOtB0lH30bR
 1gy+qrKzuGKpaN6thOISxFLLjmEeYwzYd7SvC9n118r32qShz+opN9XX0WmWSFlA
 sajE1ehm4M7s5pkMoa/dRnAyR8RUPu4RNINdQ/Z9jFfAOx+Q26rLdQXwf9+uqbEb
 bIeSQwOteK5vYYCstvpAcHSMlJAglzIX5UfZBvtEIJN7rlb0VhmGWfxAnTu+ktG1
 o9cqAt+J4146xHaFwj5duTsyKhWb8BL9+xqbKPNpXEp+PbLsrnE/+WkDLFD67jxz
 dgIoK60mGnVXp+16I2uMqYYDgAyO5zUdmM4OygOMnZNa1mxesjbDJC6Wat1Wsndn
 slsw6DkrWT60CRE42nbK
 =o57/
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
 "This update has successfully completed a 0day-kbuild run and has
  appeared in a linux-next release.  The changes outside of the typical
  drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the
  removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and
  the introduction of ZONE_DEVICE + devm_memremap_pages().

  Summary:

   - Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
     mechanism for adding device-driver-discovered memory regions to the
     kernel's direct map.

     This facility is used by the pmem driver to enable pfn_to_page()
     operations on the page frames returned by DAX ('direct_access' in
     'struct block_device_operations').

     For now, the 'memmap' allocation for these "device" pages comes
     from "System RAM".  Support for allocating the memmap from device
     memory will arrive in a later kernel.

   - Introduce memremap() to replace usages of ioremap_cache() and
     ioremap_wt().  memremap() drops the __iomem annotation for these
     mappings to memory that do not have i/o side effects.  The
     replacement of ioremap_cache() with memremap() is limited to the
     pmem driver to ease merging the api change in v4.3.

     Completion of the conversion is targeted for v4.4.

   - Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
     driver, update the VFS DAX implementation and PMEM api to provide
     persistence guarantees for kernel operations on a DAX mapping.

   - Convert the ACPI NFIT 'BLK' driver to map the block apertures as
     cacheable to improve performance.

   - Miscellaneous updates and fixes to libnvdimm including support for
     issuing "address range scrub" commands, clarifying the optimal
     'sector size' of pmem devices, a clarification of the usage of the
     ACPI '_STA' (status) property for DIMM devices, and other minor
     fixes"

* tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits)
  libnvdimm, pmem: direct map legacy pmem by default
  libnvdimm, pmem: 'struct page' for pmem
  libnvdimm, pfn: 'struct page' provider infrastructure
  x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
  add devm_memremap_pages
  mm: ZONE_DEVICE for "device memory"
  mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h
  dax: drop size parameter to ->direct_access()
  nd_blk: change aperture mapping from WC to WB
  nvdimm: change to use generic kvfree()
  pmem, dax: have direct_access use __pmem annotation
  dax: update I/O path to do proper PMEM flushing
  pmem: add copy_from_iter_pmem() and clear_pmem()
  pmem, x86: clean up conditional pmem includes
  pmem: remove layer when calling arch_has_wmb_pmem()
  pmem, x86: move x86 PMEM API to new pmem.h header
  libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
  pmem: switch to devm_ allocations
  devres: add devm_memremap
  libnvdimm, btt: write and validate parent_uuid
  ...
2015-09-08 14:35:59 -07:00
Linus Torvalds 17447717a3 Nothing major, but:
- Add Jeff Layton as an nfsd co-maintainer: no change to
           existing practice, just an acknowledgement of the status quo.
         - Two patches ("nfsd: ensure that...") for a race overlooked by
           the state locking rewrite, causing a crash noticed by multiple
           users.
         - Lots of smaller bugfixes all over from Kinglong Mee.
         - From Jeff, some cleanup of server rpc code in preparation for
           possible shift of nfsd threads to workqueues.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV6fbLAAoJECebzXlCjuG+qGkP/j2YnZynwqCa4uz1+FU7qfYI
 kZWNGFFQ7O7e1i9Wznp7BkSA020rvM5d1HPwZhtstURM3i52XWRtbppwKF2+IuEU
 tpNdPKb28BPCZO29Z8mQk9IS2sX5jmBiibXRqBk0VK7e43PXrIwg1LJJ9HOfOpLh
 b1MvxdEB7vqK+fAVIYyhlg0UDd5AHAkQ+vS8YuohRXbDcsdhhE4vmusLlUl5UKp8
 5Yunz+b+pXfXPYaKidmpar6U2KoRSTPP1uO3bNfN6URO1W1nchPadLs0DnsBKlhb
 U8II5RZEmc+YfiIMoeptkJHoNhWT6Zu7CNJR6B0USTKv4L6TmFQVpxptVutzYVwx
 sGJ65lvCiXXOPz8JJwvBty//HTmbyOiCm64/vMbhQRlSNLSmcmTXEpw/uT5Huaxx
 bX9lnznoVVCd3eRoXPwMdZTbg/uEKqREZsQWVoqA6gexYqeyp79kvGbttLoUJ27Z
 IjtNb9W6akxfPKrHMgan6j7dy866o6TdSfWRayHwUoswbNnVOnMYKHjApOtF0oev
 k2pdLuy9tjl2a9Ow9sSwHZDbNsXgJO76E0aYnSTBP/YvctlG7KoZ+E0oxa6DWTC+
 0dE+g1xhIuUtW5WRL4pfWWk1G7jnf16J91bKkn91VveDn666RncAbLBtePmpIcIu
 5Ah6KxztTVCW++i5pmHh
 =aecc
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-4.3' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "Nothing major, but:

   - Add Jeff Layton as an nfsd co-maintainer: no change to existing
     practice, just an acknowledgement of the status quo.

   - Two patches ("nfsd: ensure that...") for a race overlooked by the
     state locking rewrite, causing a crash noticed by multiple users.

   - Lots of smaller bugfixes all over from Kinglong Mee.

   - From Jeff, some cleanup of server rpc code in preparation for
     possible shift of nfsd threads to workqueues"

* tag 'nfsd-4.3' of git://linux-nfs.org/~bfields/linux: (52 commits)
  nfsd: deal with DELEGRETURN racing with CB_RECALL
  nfsd: return CLID_INUSE for unexpected SETCLIENTID_CONFIRM case
  nfsd: ensure that delegation stateid hash references are only put once
  nfsd: ensure that the ol stateid hash reference is only put once
  net: sunrpc: fix tracepoint Warning: unknown op '->'
  nfsd: allow more than one laundry job to run at a time
  nfsd: don't WARN/backtrace for invalid container deployment.
  fs: fix fs/locks.c kernel-doc warning
  nfsd: Add Jeff Layton as co-maintainer
  NFSD: Return word2 bitmask if setting security label in OPEN/CREATE
  NFSD: Set the attributes used to store the verifier for EXCLUSIVE4_1
  nfsd: SUPPATTR_EXCLCREAT must be encoded before SECURITY_LABEL.
  nfsd: Fix an FS_LAYOUT_TYPES/LAYOUT_TYPES encode bug
  NFSD: Store parent's stat in a separate value
  nfsd: Fix two typos in comments
  lockd: NLM grace period shouldn't block NFSv4 opens
  nfsd: include linux/nfs4.h in export.h
  sunrpc: Switch to using hash list instead single list
  sunrpc/nfsd: Remove redundant code by exports seq_operations functions
  sunrpc: Store cache_detail in seq_file's private directly
  ...
2015-09-05 17:26:24 -07:00
Linus Torvalds 4c12ab7e5e Merge tag 'for-f2fs-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
 "The major work includes fixing and enhancing the existing extent_cache
  feature, which has been well settling down so far and now it becomes a
  default mount option accordingly.

  Also, this version newly registers a f2fs memory shrinker to reclaim
  several objects consumed by a couple of data structures in order to
  avoid memory pressures.

  Another new feature is to add ioctl(F2FS_GARBAGE_COLLECT) which
  triggers a cleaning job explicitly by users.

  Most of the other patches are to fix bugs occurred in the corner cases
  across the whole code area"

* tag 'for-f2fs-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (85 commits)
  f2fs: upset segment_info repair
  f2fs: avoid accessing NULL pointer in f2fs_drop_largest_extent
  f2fs: update extent tree in batches
  f2fs: fix to release inode correctly
  f2fs: handle f2fs_truncate error correctly
  f2fs: avoid unneeded initializing when converting inline dentry
  f2fs: atomically set inode->i_flags
  f2fs: fix wrong pointer access during try_to_free_nids
  f2fs: use __GFP_NOFAIL to avoid infinite loop
  f2fs: lookup neighbor extent nodes for merging later
  f2fs: split __insert_extent_tree_ret for readability
  f2fs: kill dead code in __insert_extent_tree
  f2fs: adjust showing of extent cache stat
  f2fs: add largest/cached stat in extent cache
  f2fs: fix incorrect mapping for bmap
  f2fs: add annotation for space utilization of regular/inline dentry
  f2fs: fix to update cached_en of extent tree properly
  f2fs: fix typo
  f2fs: check the node block address of newly allocated nid
  f2fs: go out for insert_inode_locked failure
  ...
2015-09-03 13:10:22 -07:00
Linus Torvalds e31fb9e005 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull ext3 removal, quota & udf fixes from Jan Kara:
 "The biggest change in the pull is the removal of ext3 filesystem
  driver (~28k lines removed).  Ext4 driver is a full featured
  replacement these days and both RH and SUSE use it for several years
  without issues.  Also there are some workarounds in VM & block layer
  mainly for ext3 which we could eventually get rid of.

  Other larger change is addition of proper error handling for
  dquot_initialize().  The rest is small fixes and cleanups"

[ I wasn't convinced about the ext3 removal and worried about things
  falling through the cracks for legacy users, but ext4 maintainers
  piped up and were all unanimously in favor of removal, and maintaining
  all legacy ext3 support inside ext4.   - Linus ]

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Don't modify filesystem for read-only mounts
  quota: remove an unneeded condition
  ext4: memory leak on error in ext4_symlink()
  mm/Kconfig: NEED_BOUNCE_POOL: clean-up condition
  ext4: Improve ext4 Kconfig test
  block: Remove forced page bouncing under IO
  fs: Remove ext3 filesystem driver
  doc: Update doc about journalling layer
  jfs: Handle error from dquot_initialize()
  reiserfs: Handle error from dquot_initialize()
  ocfs2: Handle error from dquot_initialize()
  ext4: Handle error from dquot_initialize()
  ext2: Handle error from dquot_initalize()
  quota: Propagate error from ->acquire_dquot()
2015-09-03 12:28:30 -07:00
Linus Torvalds e2701603f7 There's been a fair amount going on in the docs tree this time around,
including:
 
  - Support for reproducible document builds, from Ben Hutchings and
    company.
 
  - The ability to automatically generate cross-reference links within a
    single DocBook book and embedded descriptions for large structures.
    From Danilo Cesar Lemes de Paula.
 
  - A new document on how to add a system call from David Drysdale.
 
  - Chameleon bus documentation from Johannes Thumshirn.
 
 ...plus the usual collection of improvements, typo fixes, and more.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV4/B0AAoJEI3ONVYwIuV6Y1UQAIpU1TSPCuRcgPLgKhuEty9w
 nMKA/Rn2Wye0608HbZ1FhpUXTS914kYOF0zA6f4xS1kOGoqqgSOfkP/bXfdZP67P
 aH1onug5xFIfMUTGT2tAHabPymbGCZARVe/1YYKPTTh7hu4ZnPd7HULelp/KHa7Y
 9yl/VQzggu4ASWXTKU89vZmoNSvUf+e73sCB+ZQ69QgY2JAnK9HaWDjOpWesev21
 ZSZneWrvVWngupWAsw8Wy+QVbqEEIMd5+XC7hN+GEPZInuRGr5oAuyUgmO90JiER
 WmFH5D4vRi3KG5XLmvHdUA5lhOqGM3cZC8W9Xw7byf86NVdWoN9rVs4IhBJjC7PC
 v0foIuORKbuxqXJ/bn2pMXuWEq9EU80mvs+Hot7cMTP06syXY2/ROcSaReJ9TjcU
 yw9uY8tOiR5Pq7tfqDylBbsJKlgYSatdsZKacLG5rCuUwLhsxlTaxG+yyg4zIfeb
 EBvsRZBUTElzBELHxTwsOmUJf98QvUEuq8EHfGhUIR0IAnzzSwe2rccCtzohI2je
 Sk0R3W9kKZdrgOr8vBehPEXxUYFDoWx+6MgpJhD7eSMhuL6510Gp5AV7Fj/+tTgE
 8aTfubk2CGES5Aiwnyi8+Jf+LelPcWpKU4p3rsbsjnrbREV7YzxFHwtlCGtFYBhG
 xWCMG47K4Wx8ynKyUE5m
 =pQ1H
 -----END PGP SIGNATURE-----

Merge tag 'docs-for-linus' of git://git.lwn.net/linux-2.6

Pull documentation updates from Jonathan Corbet:
 "There's been a fair amount going on in the docs tree this time around,
  including:

   - Support for reproducible document builds, from Ben Hutchings and
     company.

   - The ability to automatically generate cross-reference links within
     a single DocBook book and embedded descriptions for large
     structures.  From Danilo Cesar Lemes de Paula.

   - A new document on how to add a system call from David Drysdale.

   - Chameleon bus documentation from Johannes Thumshirn.

  ...plus the usual collection of improvements, typo fixes, and more"

* tag 'docs-for-linus' of git://git.lwn.net/linux-2.6: (39 commits)
  Documentation, add kernel-parameters.txt entry for dis_ucode_ldr
  Documentation/x86: Rename IRQSTACKSIZE to IRQ_STACK_SIZE
  Documentation/Intel-IOMMU.txt: Modify definition of DRHD
  docs: update HOWTO for 3.x -> 4.x versioning
  kernel-doc: ignore unneeded attribute information
  scripts/kernel-doc: Adding cross-reference links to html documentation.
  DocBook: Fix non-determinstic installation of duplicate man pages
  Documentation: minor typo fix in mailbox.txt
  Documentation: describe how to add a system call
  doc: Add more workqueue functions to the documentation
  ARM: keystone: add documentation for SoCs and EVMs
  scripts/kernel-doc Allow struct arguments documentation in struct body
  SubmittingPatches: remove stray quote character
  Revert "DocBook: Avoid building man pages repeatedly and inconsistently"
  Documentation: Minor changes to men-chameleon-bus.txt
  Doc: fix trivial typo in SubmittingPatches
  MAINTAINERS: Direct Documentation/DocBook/media properly
  Documentation: installed man pages don't need to be executable
  fix Evolution submenu name in email-clients.txt
  Documentation: Add MCB documentation
  ...
2015-08-31 15:40:05 -07:00
Ross Zwisler e2e05394e4 pmem, dax: have direct_access use __pmem annotation
Update the annotation for the kaddr pointer returned by direct_access()
so that it is a __pmem pointer.  This is consistent with the PMEM driver
and with how this direct_access() pointer is used in the DAX code.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-20 14:07:24 -04:00
Seymour, Shane M 223e8f01b0 sysfs.txt: update show method notes about sprintf/snprintf/scnprintf usage
Changed the documentation to allow sprintf() when the buffer
provided by sysfs cannot be overflowed. Explicitly say
snprintf() must never be used in a show function to format
data to be returned to user space.

Change based on a discussion about the patch
st: convert DRIVER_ATTR macros to DRIVER_ATTR_RO

Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Shane Seymour <shane.seymour@hp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 15:13:21 -07:00
Jaegeuk Kim 7daaea256d f2fs: add noextent_cache mount option
This patch adds noextent_cache mount option.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-08-04 14:09:55 -07:00
Wang Long 9e1aa7c888 Documentation: Update filesystems/debugfs.txt
This patch update the Documentation/filesystems/debugfs.txt
file. The main work is to add the description of the following
functions:
    debugfs_create_atomic_t
    debugfs_create_u32_array
    debugfs_create_devm_seqfile
    debugfs_create_file_size

Signed-off-by: Wang Long <long.wanglong@huawei.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2015-07-24 15:05:56 +02:00
Jan Kara c290ea01ab fs: Remove ext3 filesystem driver
The functionality of ext3 is fully supported by ext4 driver. Major
distributions (SUSE, RedHat) already use ext4 driver to handle ext3
filesystems for quite some time. There is some ugliness in mm resulting
from jbd cleaning buffers in a dirty page without cleaning page dirty
bit and also support for buffer bouncing in the block layer when stable
pages are required is there only because of jbd. So let's remove the
ext3 driver. This saves us some 28k lines of duplicated code.

Acked-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-07-23 20:59:40 +02:00
J. Bruce Fields 28e51af7c6 Revert "Documentation: NFS/RDMA: Document separate Kconfig symbols"
This reverts commit 731d5cca82.

Commit ffe1f0df58 ("rpcrdma: Merge svcrdma and xprtrdma modules into
one") forgot to update the corresponding documentation.

Reported-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-07-20 14:58:46 -04:00
Daniel Grimshaw bd760c498f Documentation: filesystems: btrfs: Fixed typos and whitespace
I am a high school student trying to become familiar with
Linux kernel development. The btrfs documentation in
Documentation/filesystems had a few typos and errors in
whitespace. This patch corrects both of these.

Signed-off-by: Daniel Grimshaw <grimshaw@linux.vnet.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2015-07-09 14:31:06 -06:00
Linus Torvalds 1dc51b8288 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 "Assorted VFS fixes and related cleanups (IMO the most interesting in
  that part are f_path-related things and Eric's descriptor-related
  stuff).  UFS regression fixes (it got broken last cycle).  9P fixes.
  fs-cache series, DAX patches, Jan's file_remove_suid() work"

[ I'd say this is much more than "fixes and related cleanups".  The
  file_table locking rule change by Eric Dumazet is a rather big and
  fundamental update even if the patch isn't huge.   - Linus ]

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits)
  9p: cope with bogus responses from server in p9_client_{read,write}
  p9_client_write(): avoid double p9_free_req()
  9p: forgetting to cancel request on interrupted zero-copy RPC
  dax: bdev_direct_access() may sleep
  block: Add support for DAX reads/writes to block devices
  dax: Use copy_from_iter_nocache
  dax: Add block size note to documentation
  fs/file.c: __fget() and dup2() atomicity rules
  fs/file.c: don't acquire files->file_lock in fd_install()
  fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
  vfs: avoid creation of inode number 0 in get_next_ino
  namei: make set_root_rcu() return void
  make simple_positive() public
  ufs: use dir_pages instead of ufs_dir_pages()
  pagemap.h: move dir_pages() over there
  remove the pointless include of lglock.h
  fs: cleanup slight list_entry abuse
  xfs: Correctly lock inode when removing suid and file capabilities
  fs: Call security_ops->inode_killpriv on truncate
  fs: Provide function telling whether file_remove_privs() will do anything
  ...
2015-07-04 19:36:06 -07:00
Matthew Wilcox 44f4c054ca dax: Add block size note to documentation
For block devices which are small enough, mkfs will default to creating
a filesystem with block sizes smaller than page size.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-04 15:56:56 -04:00
Eric Dumazet 8a81252b77 fs/file.c: don't acquire files->file_lock in fd_install()
Mateusz Guzik reported :

 Currently obtaining a new file descriptor results in locking fdtable
 twice - once in order to reserve a slot and second time to fill it.

Holding the spinlock in __fd_install() is needed in case a resize is
done, or to prevent a resize.

Mateusz provided an RFC patch and a micro benchmark :
  http://people.redhat.com/~mguzik/pipebench.c

A resize is an unlikely operation in a process lifetime,
as table size is at least doubled at every resize.

We can use RCU instead of the spinlock.

__fd_install() must wait if a resize is in progress.

The resize must block new __fd_install() callers from starting,
and wait that ongoing install are finished (synchronize_sched())

resize should be attempted by a single thread to not waste resources.

rcu_sched variant is used, as __fd_install() and expand_fdtable() run
from process context.

It gives us a ~30% speedup using pipebench on a dual Intel(R) Xeon(R)
CPU E5-2696 v2 @ 2.50GHz

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Mateusz Guzik <mguzik@redhat.com>
Acked-by: Mateusz Guzik <mguzik@redhat.com>
Tested-by: Mateusz Guzik <mguzik@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-01 02:30:09 -04:00
Linus Torvalds 68b4449d79 xfs: update for 4.2-rc1
This update contains:
 
 o A new sparse on-disk inode record format to allow small extents to
   be used for inode allocation when free space is fragmented.
 o DAX support. This includes minor changes to the DAX core code to
   fix problems with lock ordering and bufferhead mapping abuse.
 o transaction commit interface cleanup
 o removal of various unnecessary XFS specific type definitions
 o cleanup and optimisation of freelist preparation before allocation
 o various minor cleanups
 o bug fixes for
 	- transaction reservation leaks
 	- incorrect inode logging in unwritten extent conversion
 	- mmap lock vs freeze ordering
 	- remote symlink mishandling
 	- attribute fork removal issues.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJVkhI0AAoJEK3oKUf0dfod45MQAJCOEkNduBdlfPvTCMPjj/7z
 vzcfDdzgKwhpPTMXSDRvw4zDPt3C2FLMBJqxtPpC4sKGKG/8G0kFvw8bDtBag1m9
 ru5nI5LaQ6LC5RcU40zxBx1s/L8qYvyfUlxeoOT5lSwN9c6ENGOCQ3bUk4pSKaee
 pWDplag9LbfQomW2GHtxd8agMUZEYx0R1vgfv88V8xgPka8CvQo81XUgkb4PcDZV
 ugR+wDUsvwMS01aLYBmRFkMXuExNuCJVwtvdTJS+ZWGHzyTpulFoANUW6QT24gAM
 eP4yRXN4bv9vXrXpg8JkF25DHsfw4HBwNEL17ZvoB8t3oJp1/NYaH8ce1jS0+I8i
 NCtaO+qUqDSTGQZKgmeDPwCciQp54ra9LEdmIJFxpZxiBof9g/tIYEFgRklyFLwR
 GZU6Io6VpBa1oTGlC4D1cmG6bdcnhMB9MGVVCbqnB5mRRDKCmVgCyJwusd1pi7Re
 G4O6KkFt21O7+fP13VsjP57KoaJzsIgZ/+H3Ff/fJOJ33AKYTRCmwi8+IMi2n5JI
 zz+V0AIBQZAx9dlVyENnxufh9eJYcnwta0lUSLCCo91fZKxbo3ktK1kVHNZP5EGs
 IMFM1Ka6hibY20rWlR3GH0dfyP5/yNcvNgTMYPKjj9SVjTar1aSfF2rGpkqYXYyH
 D4FICbtDgtOc2ClfpI2k
 =3x+W
 -----END PGP SIGNATURE-----

Merge tag 'xfs-for-linus-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs

Pul xfs updates from Dave Chinner:
 "There's a couple of small API changes to the core DAX code which
  required small changes to the ext2 and ext4 code bases, but otherwise
  everything is within the XFS codebase.

  This update contains:

   - A new sparse on-disk inode record format to allow small extents to
     be used for inode allocation when free space is fragmented.

   - DAX support.  This includes minor changes to the DAX core code to
     fix problems with lock ordering and bufferhead mapping abuse.

   - transaction commit interface cleanup

   - removal of various unnecessary XFS specific type definitions

   - cleanup and optimisation of freelist preparation before allocation

   - various minor cleanups

   - bug fixes for
	- transaction reservation leaks
	- incorrect inode logging in unwritten extent conversion
	- mmap lock vs freeze ordering
	- remote symlink mishandling
	- attribute fork removal issues"

* tag 'xfs-for-linus-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (49 commits)
  xfs: don't truncate attribute extents if no extents exist
  xfs: clean up XFS_MIN_FREELIST macros
  xfs: sanitise error handling in xfs_alloc_fix_freelist
  xfs: factor out free space extent length check
  xfs: xfs_alloc_fix_freelist() can use incore perag structures
  xfs: remove xfs_caddr_t
  xfs: use void pointers in log validation helpers
  xfs: return a void pointer from xfs_buf_offset
  xfs: remove inst_t
  xfs: remove __psint_t and __psunsigned_t
  xfs: fix remote symlinks on V5/CRC filesystems
  xfs: fix xfs_log_done interface
  xfs: saner xfs_trans_commit interface
  xfs: remove the flags argument to xfs_trans_cancel
  xfs: pass a boolean flag to xfs_trans_free_items
  xfs: switch remaining xfs_trans_dup users to xfs_trans_roll
  xfs: check min blks for random debug mode sparse allocations
  xfs: fix sparse inodes 32-bit compile failure
  xfs: add initial DAX support
  xfs: add DAX IO path support
  ...
2015-06-30 20:16:08 -07:00
Linus Torvalds d2c3ac7e7e Merge branch 'for-4.2' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
 "A relatively quiet cycle, with a mix of cleanup and smaller bugfixes"

* 'for-4.2' of git://linux-nfs.org/~bfields/linux: (24 commits)
  sunrpc: use sg_init_one() in krb5_rc4_setup_enc/seq_key()
  nfsd: wrap too long lines in nfsd4_encode_read
  nfsd: fput rd_file from XDR encode context
  nfsd: take struct file setup fully into nfs4_preprocess_stateid_op
  nfsd: refactor nfs4_preprocess_stateid_op
  nfsd: clean up raparams handling
  nfsd: use swap() in sort_pacl_range()
  rpcrdma: Merge svcrdma and xprtrdma modules into one
  svcrdma: Add a separate "max data segs macro for svcrdma
  svcrdma: Replace GFP_KERNEL in a loop with GFP_NOFAIL
  svcrdma: Keep rpcrdma_msg fields in network byte-order
  svcrdma: Fix byte-swapping in svc_rdma_sendto.c
  nfsd: Update callback sequnce id only CB_SEQUENCE success
  nfsd: Reset cb_status in nfsd4_cb_prepare() at retrying
  svcrdma: Remove svc_rdma_xdr_decode_deferred_req()
  SUNRPC: Move EXPORT_SYMBOL for svc_process
  uapi/nfs: Add NFSv4.1 ACL definitions
  nfsd: Remove dead declarations
  nfsd: work around a gcc-5.1 warning
  nfsd: Checking for acl support does not require fetching any acls
  ...
2015-06-27 10:14:39 -07:00
Linus Torvalds a7296b49fb Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF fixes and cleanups from Jan Kara:
 "The contains some small fixes and improvements in error handling for
  UDF.

  Bundled is also one ext3 coding style fix and a fix in quota
  documentation"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: fix udf_load_pvoldesc()
  udf: remove double err declaration in udf_file_write_iter()
  UDF: support NFSv2 export
  fs: ext3: super: fixed a space coding style issue
  quota: Update documentation
  udf: Return error from udf_find_entry()
  udf: Make udf_get_filename() return error instead of 0 length file name
  udf: bug on exotic flag in udf_get_filename()
  udf: improve error management in udf_CS0toNLS()
  udf: improve error management in udf_CS0toUTF8()
  udf: unicode: update function name in comments
  udf: remove unnecessary test in udf_build_ustr_exact()
  udf: Return -ENOMEM when allocation fails in udf_get_filename()
2015-06-24 20:07:10 -07:00
Linus Torvalds 1e467e68e5 Documentation updates for 4.2
The main thing here is Ingo's big subdirectory documenting feature support
 for each architecture.  Beyond that, it's the usual pile of fixes, tweaks,
 and small additions.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVi0g2AAoJEI3ONVYwIuV6Me4QAIfa79z05ABSjlyWaKw46plH
 lULR9cyHdR59JVPHKjSOfT9/c+GOdoz6kkXQoe/TgVyj5fRB8seUW5GJXCASndkk
 aVd4c6yKFH1NISXsSdVQC0JbpgAURgcSR6x59It++fG3NINvXronFTWGMBHMLKcI
 A2hM2jNP914Dy5r4ipWZKzF1KxIlqK9kmLxlNoE6/LoQfBhh1dMdnyfuM11sguAy
 s5pr9JeCPbWC0RE7st/qEivXF4lpj6hd3XoYfM2Y+oukj5xEPQevLTLHOgtesnx9
 guUAul5Sw27n+Dx8I0Qxf1n+5SkrijoAa72g5vAxTs+ilOey67qba012NaYSy7RK
 s15XOIZ/1JTS9JjkO7GR5NbG6AiIIAH5P+Y501ivCIrsWciTOgKj7cOzakIEV8/P
 NX4120Lh5lbBrWeYkl8WbgMO0Me8cThbALC+rncF/wjvGyREKyxNlZ9qvBqmHYjG
 5Et2DT+rANaDmmblgMK3tX/zI1g3pN51e+CRF+Hzh1jZD3MZ/i+KS4qgfGFDzMIj
 uoniO5VfyD4zRbyv4Grg7XMpXiP8xFxKDypglYiXzzwlkarUgbMGOoFE7AkiPOKB
 t9gLPetbDsDyU/bSpzHlfObZp+q+pCxHPhyLS7hxEi3gBxYajIMbkpHHJugnE0+H
 TfkIhy6QQm1vAPTpRXaE
 =ODt8
 -----END PGP SIGNATURE-----

Merge tag 'docs-for-linus' of git://git.lwn.net/linux-2.6

Pull documentation updates from Jonathan Corbet:
 "The main thing here is Ingo's big subdirectory documenting feature
  support for each architecture.  Beyond that, it's the usual pile of
  fixes, tweaks, and small additions"

* tag 'docs-for-linus' of git://git.lwn.net/linux-2.6: (79 commits)
  doc:md: fix typo in md.txt.
  Documentation/mic/mpssd: don't build x86 userspace when cross compiling
  Documentation/prctl: don't build tsc tests when cross compiling
  Documentation/vDSO: don't build tests when cross compiling
  Doc:ABI/testing: Fix typo in sysfs-bus-fcoe
  Doc: Docbook: Change wikipedia's URL from http to https in scsi.tmpl
  Doc: Change wikipedia's URL from http to https
  Documentation/kernel-parameters: add missing pciserial to the earlyprintk
  Doc:pps: Fix typo in pps.txt
  kbuild : Fix documentation of INSTALL_HDR_PATH
  Documentation: filesystems: updated struct file_operations documentation in vfs.txt
  kbuild: edit explanation of clean-files variable
  Doc: ja_JP: Fix typo in HOWTO
  Move freefall program from Documentation/ to tools/
  Documentation: ARM: EXYNOS: Describe boot loaders interface
  Doc:nfc: Fix typo in nfc-hci.txt
  vfs: Minor documentation fix
  Doc: networking: txtimestamp: fix printf format warning
  Documentation, intel_pstate: Improve legacy mode internal governors description
  Documentation: extend use case for EXPORT_SYMBOL_GPL()
  ...
2015-06-24 20:01:36 -07:00
Al Viro 8ea3a7c0df Merge branch 'fscache-fixes' into for-next 2015-06-23 18:01:30 -04:00
Thomas de Beauchene 0d03943c0b Documentation: filesystems: updated struct file_operations documentation in vfs.txt
Updated struct file_operations documentation in vfs.txt to match
current implementation

Signed-off-by: Thomas de Beauchene <chauvo_t@epitech.eu>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2015-06-08 17:01:14 -06:00
Andreas Gruenbacher 47016077b6 vfs: Minor documentation fix
The check_acl inode operation and the IPERM_FLAG_RCU flag are long gone; update
the documentation.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2015-06-05 08:45:47 +09:00
Fanael Linithien 4d66ea091a xfs: fix kernel version in docs
Linux v3.20 was released as v4.0.

Signed-off-by: Fanael Linithien <fanael4@gmail.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-06-01 07:15:38 +10:00
Jan Kara 18e25fa403 quota: Update documentation
Mention that quota netlink messages are sent only in initial network
namespace.

Signed-off-by: Jan Kara <jack@suse.cz>
2015-05-18 11:23:07 +02:00
NeilBrown 99ff6cf0e6 Documentation: remove outdated information from automount-support.txt
The guidelines for adding automount support to a filesystem
in filesystems/automount-support.txt is out or date.
filesystems/autofs4.txt contains more current text, so replace
the out-of-date content with a reference to that.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:38 -04:00
Al Viro 203bc643db update Documentation/filesystems/ regarding the follow_link/put_link changes
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:36 -04:00
Al Viro 5f2c4179e1 switch ->put_link() from dentry to inode
only one instance looks at that argument at all; that sole
exception wants inode rather than dentry.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-11 08:13:12 -04:00
Al Viro 6e77137b36 don't pass nameidata to ->follow_link()
its only use is getting passed to nd_jump_link(), which can obtain
it from current->nameidata

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-10 22:20:15 -04:00
Al Viro 680baacbca new ->follow_link() and ->put_link() calling conventions
a) instead of storing the symlink body (via nd_set_link()) and returning
an opaque pointer later passed to ->put_link(), ->follow_link() _stores_
that opaque pointer (into void * passed by address by caller) and returns
the symlink body.  Returning ERR_PTR() on error, NULL on jump (procfs magic
symlinks) and pointer to symlink body for normal symlinks.  Stored pointer
is ignored in all cases except the last one.

Storing NULL for opaque pointer (or not storing it at all) means no call
of ->put_link().

b) the body used to be passed to ->put_link() implicitly (via nameidata).
Now only the opaque pointer is.  In the cases when we used the symlink body
to free stuff, ->follow_link() now should store it as opaque pointer in addition
to returning it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-10 22:19:45 -04:00