Commit Graph

868 Commits

Author SHA1 Message Date
Linus Torvalds b53c4d5eb7 This pull request contains updates for both UBI and UBIFS:
- New config option CONFIG_UBIFS_FS_SECURITY
 - Minor improvements
 - Random fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZFuwKAAoJEEtJtSqsAOnWYrUP/0/y7PEh0ZGdi4kkQy/CnuJr
 pmybsQ0TbLoljahuDXqShKkMNvuXIvKSKcHROIsXreG+DCfC3v/srZlvRt7UCPOE
 QVvjh0sQTaMUrfcaTTM9g3Im/BZX9MueTaSF2Rgx1lF+R2t3InW1bv9hvmQxfoEA
 N75tJgH69mii5pDWuGgLLjmmxhbSkMGpM31QeO5DUaLRqdXcc5L5iK5Hnd+Wtj81
 oSB5RsergCfk17jaWH2e7G03LB2tm6AhM5oksTOpZ9+OIW9GOiUMfjYFC2ZYRwzx
 zHhnh0rGPfFv0jO5u4CXtWaQDfxyw6Z7XLK+Xo1RemkhM/7AQl2xetfIVDXErgoA
 NxxN/a8MWcEpJ2x6y/Z8740HXjyQjt9h3nHzlVPNP8hz68J796E7UzRjCQtf7Iyh
 xqhfjMabxfBqcLkTESvgmjcuwo1IkqOaFBjIw2Cd2nfBEkCKzoaINjRHitgUGj/z
 Mm1CJNWvaK6QTdZ3iCETCyPQI02A+4ZXhDf/QZS3wRAMc1v45pS/dVeBn+0F8Nrc
 ASiQwcd7u1IfJa3A6d6DgMECUWBXjc1GGMfMyhS/ta56pOfe1RyR3bg9WuISqUMe
 86id9tiSs7cP2UVFTrFFFWAO3rATj+9cOO9f2LTujPzcd88cJhKSykaLPmELfyE9
 YUPw9lpExwyXLn7S46LQ
 =9ZJe
 -----END PGP SIGNATURE-----

Merge tag 'upstream-4.12-rc1' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS updates from Richard Weinberger:

 - new config option CONFIG_UBIFS_FS_SECURITY

 - minor improvements

 - random fixes

* tag 'upstream-4.12-rc1' of git://git.infradead.org/linux-ubifs:
  ubi: Add debugfs file for tracking PEB state
  ubifs: Fix a typo in comment of ioctl2ubifs & ubifs2ioctl
  ubifs: Remove unnecessary assignment
  ubifs: Fix cut and paste error on sb type comparisons
  ubi: fastmap: Fix slab corruption
  ubifs: Add CONFIG_UBIFS_FS_SECURITY to disable/enable security labels
  ubi: Make mtd parameter readable
  ubi: Fix section mismatch
2017-05-13 10:23:12 -07:00
Linus Torvalds bf5f89463f Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - the rest of MM

 - various misc things

 - procfs updates

 - lib/ updates

 - checkpatch updates

 - kdump/kexec updates

 - add kvmalloc helpers, use them

 - time helper updates for Y2038 issues. We're almost ready to remove
   current_fs_time() but that awaits a btrfs merge.

 - add tracepoints to DAX

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (114 commits)
  drivers/staging/ccree/ssi_hash.c: fix build with gcc-4.4.4
  selftests/vm: add a test for virtual address range mapping
  dax: add tracepoint to dax_insert_mapping()
  dax: add tracepoint to dax_writeback_one()
  dax: add tracepoints to dax_writeback_mapping_range()
  dax: add tracepoints to dax_load_hole()
  dax: add tracepoints to dax_pfn_mkwrite()
  dax: add tracepoints to dax_iomap_pte_fault()
  mtd: nand: nandsim: convert to memalloc_noreclaim_*()
  treewide: convert PF_MEMALLOC manipulations to new helpers
  mm: introduce memalloc_noreclaim_{save,restore}
  mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC
  mm/huge_memory.c: deposit a pgtable for DAX PMD faults when required
  mm/huge_memory.c: use zap_deposited_table() more
  time: delete CURRENT_TIME_SEC and CURRENT_TIME
  gfs2: replace CURRENT_TIME with current_time
  apparmorfs: replace CURRENT_TIME with current_time()
  lustre: replace CURRENT_TIME macro
  fs: ubifs: replace CURRENT_TIME_SEC with current_time
  fs: ufs: use ktime_get_real_ts64() for birthtime
  ...
2017-05-08 18:17:56 -07:00
Deepa Dinamani 607a11ad94 fs: ubifs: replace CURRENT_TIME_SEC with current_time
CURRENT_TIME_SEC is not y2038 safe.  current_time() will be transitioned
to use 64 bit time along with vfs in a separate patch.  There is no plan
to transition CURRENT_TIME_SEC to use y2038 safe time interfaces.

current_time() returns timestamps according to the granularities set in
the inode's super_block.  The granularity check to call
current_fs_time() or CURRENT_TIME_SEC is not required.

Use current_time() directly to update inode timestamp.  Use
timespec_trunc during file system creation, before the first inode is
created.

Link: http://lkml.kernel.org/r/1491613030-11599-9-git-send-email-deepa.kernel@gmail.com
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08 17:15:15 -07:00
Rock Lee 798868c021 ubifs: Fix a typo in comment of ioctl2ubifs & ubifs2ioctl
Change 'convert' to 'converts'
Change 'UBIFS' to 'UBIFS inode flags'

Signed-off-by: Rock Lee <rockdotlee@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-05-08 20:48:55 +02:00
Stefan Agner 2a068daf57 ubifs: Remove unnecessary assignment
Assigning a value of a variable to itself is not useful.

Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-05-08 20:48:47 +02:00
Colin Ian King 6a258f7d0f ubifs: Fix cut and paste error on sb type comparisons
The check for the bad node type of sb->type is checking sa->type
and not sb-type. This looks like a cut and paste error. Fix this.

Detected by PVS-Studio, warning: V581

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-05-08 20:48:41 +02:00
Hyunchul Lee 8326c1eec2 ubifs: Add CONFIG_UBIFS_FS_SECURITY to disable/enable security labels
When write syscall is called, every time security label is searched to
determine that file's privileges should be changed.
If LSM(Linux Security Model) is not used, this is useless.

So introduce CONFIG_UBIFS_SECURITY to disable security labels. it's default
value is "y".

Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-05-08 20:48:23 +02:00
Linus Torvalds 677375cef8 Only bug fixes and cleanups for this merge window.
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAlkPYHkACgkQ8vlZVpUN
 gaM97ggAlOm8n/tlbcdonX/+HHjlnqcy5uYD7A9AH/JordpRzy4eqcMbxMG39p1R
 DBtjo9Y0i3iFEGajRc0h7KXDLeTBUQ/JZpR8H60MFfAQHnTowuI91eb3/6QeZiHh
 CN/2KKzpYitPIEUfEHnVeYKOfvrzR7je5hrEiAwEkPeKv7XyrNVM0LHQ/jKpbQwg
 ntIzHvxjQyo8plx/m5S4Yew7tqjYpNiq4plmyk/Vxtw2FmB/FC76UxYeadoB3EI5
 etw+bCORB0tFZO27o56kXywg+mDcp7HEtVvq9LG28oEuBDAVKNoeKEvV7SiOBlZp
 +HnqIz5Hx1UTxOlTAc10IjvEhriEuw==
 =qCDl
 -----END PGP SIGNATURE-----

Merge tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt

Pull fscrypt updates from Ted Ts'o:
 "Only bug fixes and cleanups for this merge window"

* tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt:
  fscrypt: correct collision claim for digested names
  MAINTAINERS: fscrypt: update mailing list, patchwork, and git
  ext4: clean up ext4_match() and callers
  f2fs: switch to using fscrypt_match_name()
  ext4: switch to using fscrypt_match_name()
  fscrypt: introduce helper function for filename matching
  fscrypt: avoid collisions when presenting long encrypted filenames
  f2fs: check entire encrypted bigname when finding a dentry
  ubifs: check for consistent encryption contexts in ubifs_lookup()
  f2fs: sync f2fs_lookup() with ext4_lookup()
  ext4: remove "nokey" check from ext4_lookup()
  fscrypt: fix context consistency check when key(s) unavailable
  fscrypt: Remove __packed from fscrypt_policy
  fscrypt: Move key structure and constants to uapi
  fscrypt: remove fscrypt_symlink_data_len()
  fscrypt: remove unnecessary checks for NULL operations
2017-05-08 11:40:34 -07:00
Eric Biggers 413d5a9edb ubifs: check for consistent encryption contexts in ubifs_lookup()
As ext4 and f2fs do, ubifs should check for consistent encryption
contexts during ->lookup() in an encrypted directory.  This protects
certain users of filesystem encryption against certain types of offline
attacks.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-05-04 11:44:35 -04:00
Linus Torvalds 694752922b Merge branch 'for-4.12/block' of git://git.kernel.dk/linux-block
Pull block layer updates from Jens Axboe:

 - Add BFQ IO scheduler under the new blk-mq scheduling framework. BFQ
   was initially a fork of CFQ, but subsequently changed to implement
   fairness based on B-WF2Q+, a modified variant of WF2Q. BFQ is meant
   to be used on desktop type single drives, providing good fairness.
   From Paolo.

 - Add Kyber IO scheduler. This is a full multiqueue aware scheduler,
   using a scalable token based algorithm that throttles IO based on
   live completion IO stats, similary to blk-wbt. From Omar.

 - A series from Jan, moving users to separately allocated backing
   devices. This continues the work of separating backing device life
   times, solving various problems with hot removal.

 - A series of updates for lightnvm, mostly from Javier. Includes a
   'pblk' target that exposes an open channel SSD as a physical block
   device.

 - A series of fixes and improvements for nbd from Josef.

 - A series from Omar, removing queue sharing between devices on mostly
   legacy drivers. This helps us clean up other bits, if we know that a
   queue only has a single device backing. This has been overdue for
   more than a decade.

 - Fixes for the blk-stats, and improvements to unify the stats and user
   windows. This both improves blk-wbt, and enables other users to
   register a need to receive IO stats for a device. From Omar.

 - blk-throttle improvements from Shaohua. This provides a scalable
   framework for implementing scalable priotization - particularly for
   blk-mq, but applicable to any type of block device. The interface is
   marked experimental for now.

 - Bucketized IO stats for IO polling from Stephen Bates. This improves
   efficiency of polled workloads in the presence of mixed block size
   IO.

 - A few fixes for opal, from Scott.

 - A few pulls for NVMe, including a lot of fixes for NVMe-over-fabrics.
   From a variety of folks, mostly Sagi and James Smart.

 - A series from Bart, improving our exposed info and capabilities from
   the blk-mq debugfs support.

 - A series from Christoph, cleaning up how handle WRITE_ZEROES.

 - A series from Christoph, cleaning up the block layer handling of how
   we track errors in a request. On top of being a nice cleanup, it also
   shrinks the size of struct request a bit.

 - Removal of mg_disk and hd (sorry Linus) by Christoph. The former was
   never used by platforms, and the latter has outlived it's usefulness.

 - Various little bug fixes and cleanups from a wide variety of folks.

* 'for-4.12/block' of git://git.kernel.dk/linux-block: (329 commits)
  block: hide badblocks attribute by default
  blk-mq: unify hctx delay_work and run_work
  block: add kblock_mod_delayed_work_on()
  blk-mq: unify hctx delayed_run_work and run_work
  nbd: fix use after free on module unload
  MAINTAINERS: bfq: Add Paolo as maintainer for the BFQ I/O scheduler
  blk-mq-sched: alloate reserved tags out of normal pool
  mtip32xx: use runtime tag to initialize command header
  scsi: Implement blk_mq_ops.show_rq()
  blk-mq: Add blk_mq_ops.show_rq()
  blk-mq: Show operation, cmd_flags and rq_flags names
  blk-mq: Make blk_flags_show() callers append a newline character
  blk-mq: Move the "state" debugfs attribute one level down
  blk-mq: Unregister debugfs attributes earlier
  blk-mq: Only unregister hctxs for which registration succeeded
  blk-mq-debugfs: Rename functions for registering and unregistering the mq directory
  blk-mq: Let blk_mq_debugfs_register() look up the queue name
  blk-mq: Register <dev>/queue/mq after having registered <dev>/queue
  ide-pm: always pass 0 error to ide_complete_rq in ide_do_devset
  ide-pm: always pass 0 error to __blk_end_request_all
  ..
2017-05-01 10:39:57 -07:00
Jan Kara 99edd4580b ubifs: Convert to separately allocated bdi
Allocate struct backing_dev_info separately instead of embedding it
inside the superblock. This unifies handling of bdi among users.

CC: Richard Weinberger <richard@nod.at>
CC: Artem Bityutskiy <dedekind1@gmail.com>
CC: Adrian Hunter <adrian.hunter@intel.com>
CC: linux-mtd@lists.infradead.org
Acked-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-20 12:09:55 -06:00
Richard Weinberger 32fe905c17 ubifs: Fix O_TMPFILE corner case in ubifs_link()
It is perfectly fine to link a tmpfile back using linkat().
Since tmpfiles are created with a link count of 0 they appear
on the orphan list, upon re-linking the inode has to be removed
from the orphan list again.

Ralph faced a filesystem corruption in combination with overlayfs
due to this bug.

Cc: <stable@vger.kernel.org>
Cc: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Reported-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Tested-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Reported-by: Amir Goldstein <amir73il@gmail.com>
Fixes: 474b93704f ("ubifs: Implement O_TMPFILE")
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-04-18 23:18:02 +02:00
Felix Fietkau c3d9fda688 ubifs: Fix RENAME_WHITEOUT support
Remove faulty leftover check in do_rename(), apparently introduced in a
merge that combined whiteout support changes with commit f03b8ad8d3
("fs: support RENAME_NOREPLACE for local filesystems")

Fixes: f03b8ad8d3 ("fs: support RENAME_NOREPLACE for local
filesystems")
Fixes: 9e0a1fff8d ("ubifs: Implement RENAME_WHITEOUT")
Cc: stable@vger.kernel.org
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-03-30 09:28:02 +02:00
Hyunchul Lee 33fda9fa9f ubifs: Fix debug messages for an invalid filename in ubifs_dump_inode
instead of filenames, print inode numbers, file types, and length.

Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-03-30 09:27:53 +02:00
Hyunchul Lee e328379a18 ubifs: Fix debug messages for an invalid filename in ubifs_dump_node
if a character is not printable, print '?' instead of that.

Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-03-30 09:27:47 +02:00
Hyunchul Lee b20e2d9999 ubifs: Remove filename from debug messages in ubifs_readdir
if filename is encrypted, filename could have no printable characters.
so remove it.

Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-03-30 09:27:28 +02:00
Richard Weinberger 63ed657362 ubifs: Fix memory leak in error path in ubifs_mknod
When fscrypt_setup_filename() fails we have to free dev.

Signed-off-by: Richard Weinberger <richard@nod.at>
2017-03-30 09:27:27 +02:00
David Howells a528d35e8b statx: Add a system call to make enhanced file info available
Add a system call to make extended file information available, including
file creation and some attribute flags where available through the
underlying filesystem.

The getattr inode operation is altered to take two additional arguments: a
u32 request_mask and an unsigned int flags that indicate the
synchronisation mode.  This change is propagated to the vfs_getattr*()
function.

Functions like vfs_stat() are now inline wrappers around new functions
vfs_statx() and vfs_statx_fd() to reduce stack usage.

========
OVERVIEW
========

The idea was initially proposed as a set of xattrs that could be retrieved
with getxattr(), but the general preference proved to be for a new syscall
with an extended stat structure.

A number of requests were gathered for features to be included.  The
following have been included:

 (1) Make the fields a consistent size on all arches and make them large.

 (2) Spare space, request flags and information flags are provided for
     future expansion.

 (3) Better support for the y2038 problem [Arnd Bergmann] (tv_sec is an
     __s64).

 (4) Creation time: The SMB protocol carries the creation time, which could
     be exported by Samba, which will in turn help CIFS make use of
     FS-Cache as that can be used for coherency data (stx_btime).

     This is also specified in NFSv4 as a recommended attribute and could
     be exported by NFSD [Steve French].

 (5) Lightweight stat: Ask for just those details of interest, and allow a
     netfs (such as NFS) to approximate anything not of interest, possibly
     without going to the server [Trond Myklebust, Ulrich Drepper, Andreas
     Dilger] (AT_STATX_DONT_SYNC).

 (6) Heavyweight stat: Force a netfs to go to the server, even if it thinks
     its cached attributes are up to date [Trond Myklebust]
     (AT_STATX_FORCE_SYNC).

And the following have been left out for future extension:

 (7) Data version number: Could be used by userspace NFS servers [Aneesh
     Kumar].

     Can also be used to modify fill_post_wcc() in NFSD which retrieves
     i_version directly, but has just called vfs_getattr().  It could get
     it from the kstat struct if it used vfs_xgetattr() instead.

     (There's disagreement on the exact semantics of a single field, since
     not all filesystems do this the same way).

 (8) BSD stat compatibility: Including more fields from the BSD stat such
     as creation time (st_btime) and inode generation number (st_gen)
     [Jeremy Allison, Bernd Schubert].

 (9) Inode generation number: Useful for FUSE and userspace NFS servers
     [Bernd Schubert].

     (This was asked for but later deemed unnecessary with the
     open-by-handle capability available and caused disagreement as to
     whether it's a security hole or not).

(10) Extra coherency data may be useful in making backups [Andreas Dilger].

     (No particular data were offered, but things like last backup
     timestamp, the data version number and the DOS archive bit would come
     into this category).

(11) Allow the filesystem to indicate what it can/cannot provide: A
     filesystem can now say it doesn't support a standard stat feature if
     that isn't available, so if, for instance, inode numbers or UIDs don't
     exist or are fabricated locally...

     (This requires a separate system call - I have an fsinfo() call idea
     for this).

(12) Store a 16-byte volume ID in the superblock that can be returned in
     struct xstat [Steve French].

     (Deferred to fsinfo).

(13) Include granularity fields in the time data to indicate the
     granularity of each of the times (NFSv4 time_delta) [Steve French].

     (Deferred to fsinfo).

(14) FS_IOC_GETFLAGS value.  These could be translated to BSD's st_flags.
     Note that the Linux IOC flags are a mess and filesystems such as Ext4
     define flags that aren't in linux/fs.h, so translation in the kernel
     may be a necessity (or, possibly, we provide the filesystem type too).

     (Some attributes are made available in stx_attributes, but the general
     feeling was that the IOC flags were to ext[234]-specific and shouldn't
     be exposed through statx this way).

(15) Mask of features available on file (eg: ACLs, seclabel) [Brad Boyer,
     Michael Kerrisk].

     (Deferred, probably to fsinfo.  Finding out if there's an ACL or
     seclabal might require extra filesystem operations).

(16) Femtosecond-resolution timestamps [Dave Chinner].

     (A __reserved field has been left in the statx_timestamp struct for
     this - if there proves to be a need).

(17) A set multiple attributes syscall to go with this.

===============
NEW SYSTEM CALL
===============

The new system call is:

	int ret = statx(int dfd,
			const char *filename,
			unsigned int flags,
			unsigned int mask,
			struct statx *buffer);

The dfd, filename and flags parameters indicate the file to query, in a
similar way to fstatat().  There is no equivalent of lstat() as that can be
emulated with statx() by passing AT_SYMLINK_NOFOLLOW in flags.  There is
also no equivalent of fstat() as that can be emulated by passing a NULL
filename to statx() with the fd of interest in dfd.

Whether or not statx() synchronises the attributes with the backing store
can be controlled by OR'ing a value into the flags argument (this typically
only affects network filesystems):

 (1) AT_STATX_SYNC_AS_STAT tells statx() to behave as stat() does in this
     respect.

 (2) AT_STATX_FORCE_SYNC will require a network filesystem to synchronise
     its attributes with the server - which might require data writeback to
     occur to get the timestamps correct.

 (3) AT_STATX_DONT_SYNC will suppress synchronisation with the server in a
     network filesystem.  The resulting values should be considered
     approximate.

mask is a bitmask indicating the fields in struct statx that are of
interest to the caller.  The user should set this to STATX_BASIC_STATS to
get the basic set returned by stat().  It should be noted that asking for
more information may entail extra I/O operations.

buffer points to the destination for the data.  This must be 256 bytes in
size.

======================
MAIN ATTRIBUTES RECORD
======================

The following structures are defined in which to return the main attribute
set:

	struct statx_timestamp {
		__s64	tv_sec;
		__s32	tv_nsec;
		__s32	__reserved;
	};

	struct statx {
		__u32	stx_mask;
		__u32	stx_blksize;
		__u64	stx_attributes;
		__u32	stx_nlink;
		__u32	stx_uid;
		__u32	stx_gid;
		__u16	stx_mode;
		__u16	__spare0[1];
		__u64	stx_ino;
		__u64	stx_size;
		__u64	stx_blocks;
		__u64	__spare1[1];
		struct statx_timestamp	stx_atime;
		struct statx_timestamp	stx_btime;
		struct statx_timestamp	stx_ctime;
		struct statx_timestamp	stx_mtime;
		__u32	stx_rdev_major;
		__u32	stx_rdev_minor;
		__u32	stx_dev_major;
		__u32	stx_dev_minor;
		__u64	__spare2[14];
	};

The defined bits in request_mask and stx_mask are:

	STATX_TYPE		Want/got stx_mode & S_IFMT
	STATX_MODE		Want/got stx_mode & ~S_IFMT
	STATX_NLINK		Want/got stx_nlink
	STATX_UID		Want/got stx_uid
	STATX_GID		Want/got stx_gid
	STATX_ATIME		Want/got stx_atime{,_ns}
	STATX_MTIME		Want/got stx_mtime{,_ns}
	STATX_CTIME		Want/got stx_ctime{,_ns}
	STATX_INO		Want/got stx_ino
	STATX_SIZE		Want/got stx_size
	STATX_BLOCKS		Want/got stx_blocks
	STATX_BASIC_STATS	[The stuff in the normal stat struct]
	STATX_BTIME		Want/got stx_btime{,_ns}
	STATX_ALL		[All currently available stuff]

stx_btime is the file creation time, stx_mask is a bitmask indicating the
data provided and __spares*[] are where as-yet undefined fields can be
placed.

Time fields are structures with separate seconds and nanoseconds fields
plus a reserved field in case we want to add even finer resolution.  Note
that times will be negative if before 1970; in such a case, the nanosecond
fields will also be negative if not zero.

The bits defined in the stx_attributes field convey information about a
file, how it is accessed, where it is and what it does.  The following
attributes map to FS_*_FL flags and are the same numerical value:

	STATX_ATTR_COMPRESSED		File is compressed by the fs
	STATX_ATTR_IMMUTABLE		File is marked immutable
	STATX_ATTR_APPEND		File is append-only
	STATX_ATTR_NODUMP		File is not to be dumped
	STATX_ATTR_ENCRYPTED		File requires key to decrypt in fs

Within the kernel, the supported flags are listed by:

	KSTAT_ATTR_FS_IOC_FLAGS

[Are any other IOC flags of sufficient general interest to be exposed
through this interface?]

New flags include:

	STATX_ATTR_AUTOMOUNT		Object is an automount trigger

These are for the use of GUI tools that might want to mark files specially,
depending on what they are.

Fields in struct statx come in a number of classes:

 (0) stx_dev_*, stx_blksize.

     These are local system information and are always available.

 (1) stx_mode, stx_nlinks, stx_uid, stx_gid, stx_[amc]time, stx_ino,
     stx_size, stx_blocks.

     These will be returned whether the caller asks for them or not.  The
     corresponding bits in stx_mask will be set to indicate whether they
     actually have valid values.

     If the caller didn't ask for them, then they may be approximated.  For
     example, NFS won't waste any time updating them from the server,
     unless as a byproduct of updating something requested.

     If the values don't actually exist for the underlying object (such as
     UID or GID on a DOS file), then the bit won't be set in the stx_mask,
     even if the caller asked for the value.  In such a case, the returned
     value will be a fabrication.

     Note that there are instances where the type might not be valid, for
     instance Windows reparse points.

 (2) stx_rdev_*.

     This will be set only if stx_mode indicates we're looking at a
     blockdev or a chardev, otherwise will be 0.

 (3) stx_btime.

     Similar to (1), except this will be set to 0 if it doesn't exist.

=======
TESTING
=======

The following test program can be used to test the statx system call:

	samples/statx/test-statx.c

Just compile and run, passing it paths to the files you want to examine.
The file is built automatically if CONFIG_SAMPLES is enabled.

Here's some example output.  Firstly, an NFS directory that crosses to
another FSID.  Note that the AUTOMOUNT attribute is set because transiting
this directory will cause d_automount to be invoked by the VFS.

	[root@andromeda ~]# /tmp/test-statx -A /warthog/data
	statx(/warthog/data) = 0
	results=7ff
	  Size: 4096            Blocks: 8          IO Block: 1048576  directory
	Device: 00:26           Inode: 1703937     Links: 125
	Access: (3777/drwxrwxrwx)  Uid:     0   Gid:  4041
	Access: 2016-11-24 09:02:12.219699527+0000
	Modify: 2016-11-17 10:44:36.225653653+0000
	Change: 2016-11-17 10:44:36.225653653+0000
	Attributes: 0000000000001000 (-------- -------- -------- -------- -------- -------- ---m---- --------)

Secondly, the result of automounting on that directory.

	[root@andromeda ~]# /tmp/test-statx /warthog/data
	statx(/warthog/data) = 0
	results=7ff
	  Size: 4096            Blocks: 8          IO Block: 1048576  directory
	Device: 00:27           Inode: 2           Links: 125
	Access: (3777/drwxrwxrwx)  Uid:     0   Gid:  4041
	Access: 2016-11-24 09:02:12.219699527+0000
	Modify: 2016-11-17 10:44:36.225653653+0000
	Change: 2016-11-17 10:44:36.225653653+0000

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-02 20:51:15 -05:00
Dave Jiang 11bac80004 mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmf
->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to
take a vma and vmf parameter when the vma already resides in vmf.

Remove the vma parameter to simplify things.

[arnd@arndb.de: fix ARM build]
  Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:54 -08:00
Linus Torvalds 6c24337f22 Various cleanups for the file system encryption feature.
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAlirP6wACgkQ8vlZVpUN
 gaMwpQgApR67CxzlstxYjZpWPAqC8McJ2FBDX+mCOle5Vkc1WQDklwr0oCfQThTj
 eDSFRhNfIvyPh0DJ589PxBCsWOqN5h6Si7hD5ZinomVNI+IL89OytaU5EV2OpWaW
 iKdJgO9Tm8U7LuY6FOIoVdX57kUXVdkWoj61rC056B1SNhnNiVeofi7lYDM8Ix4q
 IGSQ9W24iQKmCk4hCwgObhJBRK9RnlOH0GLUmpMaS+jnfnj/uePwdxWEFsPuCOob
 8acAJ49lr55kjIw79E0BAyWxhEZ2aiArHk8PaWynT/DyNq3ftcapPlpftoeba8vo
 glBJRX70QxPvt0iHEp0ykfExkhWhFA==
 =Joki
 -----END PGP SIGNATURE-----

Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt

Pull fscrypt updates from Ted Ts'o:
 "Various cleanups for the file system encryption feature"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt:
  fscrypt: constify struct fscrypt_operations
  fscrypt: properly declare on-stack completion
  fscrypt: split supp and notsupp declarations into their own headers
  fscrypt: remove redundant assignment of res
  fscrypt: make fscrypt_operations.key_prefix a string
  fscrypt: remove unused 'mode' member of fscrypt_ctx
  ext4: don't allow encrypted operations without keys
  fscrypt: make test_dummy_encryption require a keyring key
  fscrypt: factor out bio specific functions
  fscrypt: pass up error codes from ->get_context()
  fscrypt: remove user-triggerable warning messages
  fscrypt: use EEXIST when file already uses different policy
  fscrypt: use ENOTDIR when setting encryption policy on nondirectory
  fscrypt: use ENOKEY when file cannot be created w/o key
2017-02-20 18:22:31 -08:00
Eric Biggers 6f69f0ed61 fscrypt: constify struct fscrypt_operations
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Richard Weinberger <richard@nod.at>
2017-02-08 10:59:57 -05:00
Eric Biggers 46f47e4800 fscrypt: split supp and notsupp declarations into their own headers
Previously, each filesystem configured without encryption support would
define all the public fscrypt functions to their notsupp_* stubs.  This
list of #defines had to be updated in every filesystem whenever a change
was made to the public fscrypt functions.  To make things more
maintainable now that we have three filesystems using fscrypt, split the
old header fscrypto.h into several new headers.  fscrypt_supp.h contains
the real declarations and is included by filesystems when configured
with encryption support, whereas fscrypt_notsupp.h contains the inline
stubs and is included by filesystems when configured without encryption
support.  fscrypt_common.h contains common declarations needed by both.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-02-06 23:26:43 -05:00
Richard Weinberger 1cb51a15b5 ubifs: Fix journal replay wrt. xattr nodes
When replaying the journal it can happen that a journal entry points to
a garbage collected node.
This is the case when a power-cut occurred between a garbage collect run
and a commit. In such a case nodes have to be read using the failable
read functions to detect whether the found node matches what we expect.

One corner case was forgotten, when the journal contains an entry to
remove an inode all xattrs have to be removed too. UBIFS models xattr
like directory entries, so the TNC code iterates over
all xattrs of the inode and removes them too. This code re-uses the
functions for walking directories and calls ubifs_tnc_next_ent().
ubifs_tnc_next_ent() expects to be used only after the journal and
aborts when a node does not match the expected result. This behavior can
render an UBIFS volume unmountable after a power-cut when xattrs are
used.

Fix this issue by using failable read functions in ubifs_tnc_next_ent()
too when replaying the journal.
Cc: stable@vger.kernel.org
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Reported-by: Rock Lee <rockdotlee@gmail.com>
Reviewed-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-01-17 14:35:58 +01:00
Eric Biggers 3d4b2fcbc9 ubifs: remove redundant checks for encryption key
In several places, ubifs checked for an encryption key before creating a
file in an encrypted directory.  This was redundant with
fscrypt_setup_filename() or ubifs_new_inode(), and in the case of
ubifs_link() it broke linking to special files.  So remove the extra
checks.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-01-17 14:34:21 +01:00
Eric Biggers a75467d910 ubifs: allow encryption ioctls in compat mode
The ubifs encryption ioctls did not work when called by a 32-bit program
on a 64-bit kernel.  Since 'struct fscrypt_policy' is not affected by
the word size, ubifs just needs to allow these ioctls through, like what
ext4 and f2fs do.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-01-17 14:33:41 +01:00
Arnd Bergmann 404e0b6331 ubifs: add CONFIG_BLOCK dependency for encryption
This came up during the v4.10 merge window:

warning: (UBIFS_FS_ENCRYPTION) selects FS_ENCRYPTION which has unmet direct dependencies (BLOCK)
fs/crypto/crypto.c: In function 'fscrypt_zeroout_range':
fs/crypto/crypto.c:355:9: error: implicit declaration of function 'bio_alloc';did you mean 'd_alloc'? [-Werror=implicit-function-declaration]
   bio = bio_alloc(GFP_NOWAIT, 1);

The easiest way out is to limit UBIFS_FS_ENCRYPTION to configurations
that also enable BLOCK.

Fixes: d475a50745 ("ubifs: Add skeleton for fscrypto")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-01-17 14:32:47 +01:00
Peter Rosin 507502adf0 ubifs: fix unencrypted journal write
Without this, I get the following on reboot:

UBIFS error (ubi1:0 pid 703): ubifs_load_znode: bad target node (type 1) length (8240)
UBIFS error (ubi1:0 pid 703): ubifs_load_znode: have to be in range of 48-4144
UBIFS error (ubi1:0 pid 703): ubifs_load_znode: bad indexing node at LEB 13:11080, error 5
 magic          0x6101831
 crc            0xb1cb246f
 node_type      9 (indexing node)
 group_type     0 (no node group)
 sqnum          546
 len            128
 child_cnt      5
 level          0
 Branches:
 0: LEB 14:72088 len 161 key (133, inode)
 1: LEB 14:81120 len 160 key (134, inode)
 2: LEB 20:26624 len 8240 key (134, data, 0)
 3: LEB 14:81280 len 160 key (135, inode)
 4: LEB 20:34864 len 8240 key (135, data, 0)
UBIFS warning (ubi1:0 pid 703): ubifs_ro_mode.part.0: switched to read-only mode, error -22
CPU: 0 PID: 703 Comm: mount Not tainted 4.9.0-next-20161213+ #1197
Hardware name: Atmel SAMA5
[<c010d2ac>] (unwind_backtrace) from [<c010b250>] (show_stack+0x10/0x14)
[<c010b250>] (show_stack) from [<c024df94>] (ubifs_jnl_update+0x2e8/0x614)
[<c024df94>] (ubifs_jnl_update) from [<c0254bf8>] (ubifs_mkdir+0x160/0x204)
[<c0254bf8>] (ubifs_mkdir) from [<c01a6030>] (vfs_mkdir+0xb0/0x104)
[<c01a6030>] (vfs_mkdir) from [<c0286070>] (ovl_create_real+0x118/0x248)
[<c0286070>] (ovl_create_real) from [<c0283ed4>] (ovl_fill_super+0x994/0xaf4)
[<c0283ed4>] (ovl_fill_super) from [<c019c394>] (mount_nodev+0x44/0x9c)
[<c019c394>] (mount_nodev) from [<c019c4ac>] (mount_fs+0x14/0xa4)
[<c019c4ac>] (mount_fs) from [<c01b5338>] (vfs_kern_mount+0x4c/0xd4)
[<c01b5338>] (vfs_kern_mount) from [<c01b6b80>] (do_mount+0x154/0xac8)
[<c01b6b80>] (do_mount) from [<c01b782c>] (SyS_mount+0x74/0x9c)
[<c01b782c>] (SyS_mount) from [<c0107f80>] (ret_fast_syscall+0x0/0x3c)
UBIFS error (ubi1:0 pid 703): ubifs_mkdir: cannot create directory, error -22
overlayfs: failed to create directory /mnt/ovl/work/work (errno: 22); mounting read-only

Fixes: 7799953b34 ("ubifs: Implement encrypt/decrypt for all IO")
Signed-off-by: Peter Rosin <peda@axentia.se>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-01-17 14:05:23 +01:00
Colin Ian King e8f19746e4 ubifs: ensure zero err is returned on successful return
err is no longer being set on a successful return path, causing
a garbage value being returned. Fix this by setting err to zero
for the successful return path.

Found with static analysis by CoverityScan, CID 1389473

Fixes: 7799953b34 ("ubifs: Implement encrypt/decrypt for all IO")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-01-17 13:57:32 +01:00
Eric Biggers a5d431eff2 fscrypt: make fscrypt_operations.key_prefix a string
There was an unnecessary amount of complexity around requesting the
filesystem-specific key prefix.  It was unclear why; perhaps it was
envisioned that different instances of the same filesystem type could
use different key prefixes, or that key prefixes could be binary.
However, neither of those things were implemented or really make sense
at all.  So simplify the code by making key_prefix a const char *.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-01-08 01:03:41 -05:00
Linus Torvalds 231753ef78 Merge uncontroversial parts of branch 'readlink' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull partial readlink cleanups from Miklos Szeredi.

This is the uncontroversial part of the readlink cleanup patch-set that
simplifies the default readlink handling.

Miklos and Al are still discussing the rest of the series.

* git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  vfs: make generic_readlink() static
  vfs: remove ".readlink = generic_readlink" assignments
  vfs: default to generic_readlink()
  vfs: replace calling i_op->readlink with vfs_readlink()
  proc/self: use generic_readlink
  ecryptfs: use vfs_get_link()
  bad_inode: add missing i_op initializers
2016-12-17 19:16:12 -08:00
Richard Weinberger ba75d570b6 ubifs: Initialize fstr_real_len
While fstr_real_len is only being used under if (encrypted),
gcc-6 still warns.

Fixes this false positive:
fs/ubifs/dir.c: In function 'ubifs_readdir':
fs/ubifs/dir.c:629:13: warning: 'fstr_real_len' may be used
uninitialized in this function [-Wmaybe-uninitialized]
    fstr.len = fstr_real_len

Initialize fstr_real_len to make gcc happy.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-14 17:39:21 +01:00
Richard Weinberger ec9160dacd ubifs: Use fscrypt ioctl() helpers
Commit db717d8e26 ("fscrypto: move ioctl processing more fully into
common code") moved ioctl() related functions into fscrypt and offers
us now a set of helper functions.

Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: David Gstir <david@sigma-star.at>
2016-12-13 19:54:52 +01:00
Richard Weinberger 3858866866 ubifs: Use FS_CFLG_OWN_PAGES
Commit bd7b829038 ("fscrypt: Cleanup page locking requirements for
fscrypt_{decrypt,encrypt}_page()") renamed the flag.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-13 16:18:16 +01:00
Richard Weinberger fc4b891bbe ubifs: Raise write version to 5
Starting with version 5 the following properties change:
 - UBIFS_FLG_DOUBLE_HASH is mandatory
 - UBIFS_FLG_ENCRYPTION is optional but depdens on UBIFS_FLG_DOUBLE_HASH
 - Filesystems with unknown super block flags will be rejected, this
   allows us in future to add new features without raising the UBIFS
   write version.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger e021986ee4 ubifs: Implement UBIFS_FLG_ENCRYPTION
This feature flag indicates that the filesystem contains encrypted
files.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger d63d61c169 ubifs: Implement UBIFS_FLG_DOUBLE_HASH
This feature flag indicates that all directory entry nodes have a 32bit
cookie set and therefore UBIFS is allowed to perform lookups by hash.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger cc41a53652 ubifs: Use a random number for cookies
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger 528e3d178f ubifs: Add full hash lookup support
UBIFS stores a 32bit hash of every file, for traditional lookups by name
this scheme is fine since UBIFS can first try to find the file by the
hash of the filename and upon collisions it can walk through all entries
with the same hash and do a string compare.
When filesnames are encrypted fscrypto will ask the filesystem for a
unique cookie, based on this cookie the filesystem has to be able to
locate the target file again. With 32bit hashes this is impossible
because the chance for collisions is very high. Do deal with that we
store a 32bit cookie directly in the UBIFS directory entry node such
that we get a 64bit cookie (32bit from filename hash and the dent
cookie). For a lookup by hash UBIFS finds the entry by the first 32bit
and then compares the dent cookie. If it does not match, it has to do a
linear search of the whole directory and compares all dent cookies until
the correct entry is found.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger b91dc9816e ubifs: Rename tnc_read_node_nm
tnc_read_hashed_node() is a better name since we read a node
by a given hash, not a name.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger ca7f85be8d ubifs: Add support for encrypted symlinks
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger f4f61d2cc6 ubifs: Implement encrypted filenames
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger b9bc8c7bdb ubifs: Make r5 hash binary string aware
As of now all filenames known by UBIFS are strings with a NUL
terminator. With encrypted filenames a filename can be any binary
string and the r5 function cannot search for the NUL terminator.
UBIFS always knows how long a filename is, therefore we can change
the hash function to iterate over the filename length to work
correctly with binary strings.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger 304790c038 ubifs: Relax checks in ubifs_validate_entry()
With encrypted filenames we store raw binary data, doing
string tests is no longer possible.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger 7799953b34 ubifs: Implement encrypt/decrypt for all IO
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger 1ee77870c9 ubifs: Constify struct inode pointer in ubifs_crypt_is_encrypted()
...and provide a non const variant for fscrypto

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger f1f52d6b02 ubifs: Introduce new data node field, compr_size
When data of a data node is compressed and encrypted
we need to store the size of the compressed data because
before encryption we may have to add padding bytes.

For the new field we consume the last two padding bytes
in struct ubifs_data_node. Two bytes are fine because
the data length is at most 4096.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger 959c2de2b3 ubifs: Enforce crypto policy in mmap
We need this extra check in mmap because a process could
gain an already opened fd.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger 700eada82a ubifs: Massage assert in ubifs_xattr_set() wrt. fscrypto
When we're creating a new inode in UBIFS the inode is not
yet exposed and fscrypto calls ubifs_xattr_set() without
holding the inode mutex. This is okay but ubifs_xattr_set()
has to know about this.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger 9270b2f4cd ubifs: Preload crypto context in ->lookup()
...and mark the dentry as encrypted.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger ac7e47a9ed ubifs: Enforce crypto policy in ->link and ->rename
When a file is moved or linked into another directory
its current crypto policy has to be compatible with the
target policy.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger a79bff21c1 ubifs: Implement file open operation
We need ->open() for files to load the crypto key.
If the no key is present and the file is encrypted,
refuse to open.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger ba40e6a3c4 ubifs: Implement directory open operation
We need the ->open() hook to load the crypto context
which is needed for all crypto operations within that
directory.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger 43b113fea2 ubifs: Massage ubifs_listxattr() for encryption context
We have to make sure that we don't expose our internal
crypto context to userspace.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger d475a50745 ubifs: Add skeleton for fscrypto
This is the first building block to provide file level
encryption on UBIFS.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger 6a5e98ab7d ubifs: Define UBIFS crypto context xattr
Like ext4 UBIFS will store the crypto context in a xattr
attribute.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger ade46c3a60 ubifs: Export xattr get and set functions
For fscrypto we need this function outside of xattr.c.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger f6337d8426 ubifs: Export ubifs_check_dir_empty()
fscrypto will need this function too. Also get struct ubifs_info
from the provided inode. Not all callers will have a reference to
struct ubifs_info.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Christophe Jaillet d40a796217 ubifs: Remove some dead code
'ubifs_fast_find_freeable()' can not return an error pointer, so this test
can be removed.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:06:28 +01:00
Rafał Miłecki 1b7fc2c006 ubifs: Use dirty_writeback_interval value for wbuf timer
Right now wbuf timer has hardcoded timeouts and there is no place for
manual adjustments. Some projects / cases many need that though. Few
file systems allow doing that by respecting dirty_writeback_interval
that can be set using sysctl (dirty_writeback_centisecs).

Lowering dirty_writeback_interval could be some way of dealing with user
space apps lacking proper fsyncs. This is definitely *not* a perfect
solution but we don't have ideal (user space) world. There were already
advanced discussions on this matter, mostly when ext4 was introduced and
it wasn't behaving as ext3. Anyway, the final decision was to add some
hacks to the ext4, as trying to fix whole user space or adding new API
was pointless.

We can't (and shouldn't?) just follow ext4. We can't e.g. sync on close
as this would cause too many commits and flash wearing. On the other
hand we still should allow some trade-off between -o sync and default
wbuf timeout. Respecting dirty_writeback_interval should allow some sane
cutomizations if used warily.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:06:24 +01:00
Rafał Miłecki 854826c9d5 ubifs: Drop softlimit and delta fields from struct ubifs_wbuf
Values of these fields are set during init and never modified. They are
used (read) in a single function only. There isn't really any reason to
keep them in a struct. It only makes struct just a bit bigger without
any visible gain.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:06:11 +01:00
Miklos Szeredi dfeef68862 vfs: remove ".readlink = generic_readlink" assignments
If .readlink == NULL implies generic_readlink().

Generated by:

to_del="\.readlink.*=.*generic_readlink"
for i in `git grep -l $to_del`; do sed -i "/$to_del"/d $i; done

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-09 16:45:04 +01:00
Richard Weinberger a00052a296 ubifs: Fix regression in ubifs_readdir()
Commit c83ed4c9db ("ubifs: Abort readdir upon error") broke
overlayfs support because the fix exposed an internal error
code to VFS.

Reported-by: Peter Rosin <peda@axentia.se>
Tested-by: Peter Rosin <peda@axentia.se>
Reported-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Tested-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Fixes: c83ed4c9db ("ubifs: Abort readdir upon error")
Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-28 14:48:31 +02:00
Richard Weinberger c83ed4c9db ubifs: Abort readdir upon error
If UBIFS is facing an error while walking a directory, it reports this
error and ubifs_readdir() returns the error code. But the VFS readdir
logic does not make the getdents system call fail in all cases. When the
readdir cursor indicates that more entries are present, the system call
will just return and the libc wrapper will try again since it also
knows that more entries are present.
This causes the libc wrapper to busy loop for ever when a directory is
corrupted on UBIFS.
A common approach do deal with corrupted directory entries is
skipping them by setting the cursor to the next entry. On UBIFS this
approach is not possible since we cannot compute the next directory
entry cursor position without reading the current entry. So all we can
do is setting the cursor to the "no more entries" position and make
getdents exit.

Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-20 00:06:11 +02:00
Richard Weinberger 843741c577 ubifs: Fix xattr_names length in exit paths
When the operation fails we also have to undo the changes
we made to ->xattr_names. Otherwise listxattr() will report
wrong lengths.

Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-20 00:05:54 +02:00
Richard Weinberger 390975ac39 ubifs: Rename ubifs_rename2
Since ->rename2 is gone, rename ubifs_rename2() to ubifs_rename().

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-20 00:05:47 +02:00
Linus Torvalds 4c609922a3 This pull request contains:
* Fixes for both UBI and UBIFS
 * overlayfs support (O_TMPFILE, RENAME_WHITEOUT/EXCHANGE)
 * Code refactoring for the upcoming MLC support
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJX/QOCAAoJEEtJtSqsAOnWtp4QAKItkx/LrW44rHhkoJfqG62i
 o+OaxMKNu43/v/io+68JNEkIqgEap2vMZVkfoIgIyuyPxMG7nA/zG3c2JFvQ/ReS
 uH0PmcpkIXbRBKe9IEn6rXmRz9q9UTNGhP2U5kg0rL22vwVGYIuzF4Bny25Irzf/
 LLtYOkpfZfaNTSjs1pmuJMWVFF1Rj68eVJEWL6JZ1BPQ4bRPbn5sNgOKNTJYkrJs
 GcXXNtonf3B0zOzFnmfFhVO5neo4FEG3QEQafR+qbhoNBvXSluVIAFoO4VKEcyHD
 BJbotsT64TBsBj7ol97EXxz+N6LkB3tNM3bFBvhAFXZ+EvrJ0o+2QoEOH0igWjMI
 4AXwSl6htCs+wRmqAqpJfZpfI7kv2MDUB9ZGAbuXRS888OK78Dzt1CupPW7Q12xh
 yYMNsXZvRvK82n0DfqBLQ53SIe/L3PotG2Cc29hjGaHjK+YcwVRvdp/2B3ID3O2L
 6ap/M6KA+i1SiYZI6yAEYT76jKOam9YG/psb76q66xILJ7h5XQOZODYQ9zC2towo
 Pjb+bCPzHZPm+v7xtSsP6aanZ+5xRXO91JjvsWl9UOQVDCA/Jt98H5qhCJZjIeIs
 OJ7z9PbTv0/jcBBRrjJyZIUE85omDliY4h04B3Yu44xa7Q9e7wbE+Vs/6L9txS0e
 L8TBNHmrYB7ZIprCIhcE
 =UB7l
 -----END PGP SIGNATURE-----

Merge tag 'upstream-4.9-rc1' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS updates from Richard Weinberger:
 "This pull request contains:

   - Fixes for both UBI and UBIFS
   - overlayfs support (O_TMPFILE, RENAME_WHITEOUT/EXCHANGE)
   - Code refactoring for the upcoming MLC support"

[ Ugh, we just got rid of the "rename2()" naming for the extended rename
  functionality. And this re-introduces it in ubifs with the cross-
  renaming and whiteout support.

  But rather than do any re-organizations in the merge itself, the
  naming can be cleaned up later ]

* tag 'upstream-4.9-rc1' of git://git.infradead.org/linux-ubifs: (27 commits)
  UBIFS: improve function-level documentation
  ubifs: fix host xattr_len when changing xattr
  ubifs: Use move variable in ubifs_rename()
  ubifs: Implement RENAME_EXCHANGE
  ubifs: Implement RENAME_WHITEOUT
  ubifs: Implement O_TMPFILE
  ubi: Fix Fastmap's update_vol()
  ubi: Fix races around ubi_refill_pools()
  ubi: Deal with interrupted erasures in WL
  UBI: introduce the VID buffer concept
  UBI: hide EBA internals
  UBI: provide an helper to query LEB information
  UBI: provide an helper to check whether a LEB is mapped or not
  UBI: add an helper to check lnum validity
  UBI: simplify LEB write and atomic LEB change code
  UBI: simplify recover_peb() code
  UBI: move the global ech and vidh variables into struct ubi_attach_info
  UBI: provide helpers to allocate and free aeb elements
  UBI: fastmap: use ubi_io_{read, write}_data() instead of ubi_io_{read, write}()
  UBI: fastmap: use ubi_rb_for_each_entry() in unmap_peb()
  ...
2016-10-11 10:49:44 -07:00
Linus Torvalds 101105b171 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 ">rename2() work from Miklos + current_time() from Deepa"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: Replace current_fs_time() with current_time()
  fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
  fs: Replace CURRENT_TIME with current_time() for inode timestamps
  fs: proc: Delete inode time initializations in proc_alloc_inode()
  vfs: Add current_time() api
  vfs: add note about i_op->rename changes to porting
  fs: rename "rename2" i_op to "rename"
  vfs: remove unused i_op->rename
  fs: make remaining filesystems use .rename2
  libfs: support RENAME_NOREPLACE in simple_rename()
  fs: support RENAME_NOREPLACE for local filesystems
  ncpfs: fix unused variable warning
2016-10-10 20:16:43 -07:00
Linus Torvalds 97d2116708 Merge branch 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs xattr updates from Al Viro:
 "xattr stuff from Andreas

  This completes the switch to xattr_handler ->get()/->set() from
  ->getxattr/->setxattr/->removexattr"

* 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: Remove {get,set,remove}xattr inode operations
  xattr: Stop calling {get,set,remove}xattr inode operations
  vfs: Check for the IOP_XATTR flag in listxattr
  xattr: Add __vfs_{get,set,remove}xattr helpers
  libfs: Use IOP_XATTR flag for empty directory handling
  vfs: Use IOP_XATTR flag for bad-inode handling
  vfs: Add IOP_XATTR inode operations flag
  vfs: Move xattr_resolve_name to the front of fs/xattr.c
  ecryptfs: Switch to generic xattr handlers
  sockfs: Get rid of getxattr iop
  sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
  kernfs: Switch to generic xattr handlers
  hfs: Switch to generic xattr handlers
  jffs2: Remove jffs2_{get,set,remove}xattr macros
  xattr: Remove unnecessary NULL attribute name check
2016-10-10 17:11:50 -07:00
Al Viro e55f1d1d13 Merge remote-tracking branch 'jk/vfs' into work.misc 2016-10-08 11:06:08 -04:00
Andreas Gruenbacher fd50ecaddf vfs: Remove {get,set,remove}xattr inode operations
These inode operations are no longer used; remove them.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-07 21:48:36 -04:00
Julia Lawall ec037dfcc0 UBIFS: improve function-level documentation
Fix various inconsistencies in the documentation associated with various
functions.

In the case of fs/ubifs/lprops.c, the second parameter of
ubifs_get_lp_stats was renamed from st to lst in commit 84abf972cc
("UBIFS: add re-mount debugging checks")

In the case of fs/ubifs/lpt_commit.c, the excess variables have never
existed in the associated functions since the code was introduced into the
kernel.

The others appear to be straightforward typos.

Issues detected using Coccinelle (http://coccinelle.lip6.fr/)

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:55:02 +02:00
Pascal Eberhard 74e9c700bc ubifs: fix host xattr_len when changing xattr
When an extended attribute is changed, xattr_len of host inode is
recalculated. ui->data_len is updated before computation and result
is wrong. This patch adds a temporary variable to fix computation.

To reproduce the issue:

~# > a.txt
~# attr -s an-attr -V a-value a.txt
~# attr -s an-attr -V a-bit-bigger-value a.txt

Now host inode xattr_len is wrong. Forcing dbg_check_filesystem()
generates the following error:

[  130.620140] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" started, PID 565
[  131.470790] UBIFS error (ubi0:2 pid 564): check_inodes: inode 646 has xattr size 240, but calculated size is 256
[  131.481697] UBIFS (ubi0:2): dump of the inode 646 sitting in LEB 29:114688
[  131.488953]  magic          0x6101831
[  131.492876]  crc            0x9fce9091
[  131.496836]  node_type      0 (inode node)
[  131.501193]  group_type     1 (in node group)
[  131.505788]  sqnum          9278
[  131.509191]  len            160
[  131.512549]  key            (646, inode)
[  131.516688]  creat_sqnum    9270
[  131.520133]  size           0
[  131.523264]  nlink          1
[  131.526398]  atime          1053025857.0
[  131.530574]  mtime          1053025857.0
[  131.534714]  ctime          1053025906.0
[  131.538849]  uid            0
[  131.542009]  gid            0
[  131.545140]  mode           33188
[  131.548636]  flags          0x1
[  131.551977]  xattr_cnt      1
[  131.555108]  xattr_size     240
[  131.558420]  xattr_names    12
[  131.561670]  compr_type     0x1
[  131.564983]  data len       0
[  131.568125] UBIFS error (ubi0:2 pid 564): dbg_check_filesystem: file-system check failed with error -22
[  131.578074] CPU: 0 PID: 564 Comm: mount Not tainted 4.4.12-g3639bea54a #24
[  131.585352] Hardware name: Generic AM33XX (Flattened Device Tree)
[  131.591918] [<c00151c0>] (unwind_backtrace) from [<c0012acc>] (show_stack+0x10/0x14)
[  131.600177] [<c0012acc>] (show_stack) from [<c01c950c>] (dbg_check_filesystem+0x464/0x4d0)
[  131.608934] [<c01c950c>] (dbg_check_filesystem) from [<c019f36c>] (ubifs_mount+0x14f8/0x2130)
[  131.617991] [<c019f36c>] (ubifs_mount) from [<c00d7088>] (mount_fs+0x14/0x98)
[  131.625572] [<c00d7088>] (mount_fs) from [<c00ed674>] (vfs_kern_mount+0x4c/0xd4)
[  131.633435] [<c00ed674>] (vfs_kern_mount) from [<c00efb5c>] (do_mount+0x988/0xb50)
[  131.641471] [<c00efb5c>] (do_mount) from [<c00f004c>] (SyS_mount+0x74/0xa0)
[  131.648837] [<c00f004c>] (SyS_mount) from [<c000fe20>] (ret_fast_syscall+0x0/0x3c)
[  131.665315] UBIFS (ubi0:2): background thread "ubifs_bgt0_2" stops

Signed-off-by: Pascal Eberhard <pascal.eberhard@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:55:02 +02:00
Richard Weinberger 1e03953388 ubifs: Use move variable in ubifs_rename()
...to make the code more consistent since we use
move already in other places.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:55:02 +02:00
Richard Weinberger 9ec64962af ubifs: Implement RENAME_EXCHANGE
Adds RENAME_EXCHANGE to UBIFS, the operation itself
is completely disjunct from a regular rename() that's
why we dispatch very early in ubifs_reaname().

RENAME_EXCHANGE used by the renameat2() system call
allows the caller to exchange two paths atomically.
Both paths have to exist and have to be on the same
filesystem.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:55:02 +02:00
Richard Weinberger 9e0a1fff8d ubifs: Implement RENAME_WHITEOUT
Adds RENAME_WHITEOUT support to UBIFS, we implement
it in the same way as ext4 and xfs do.
For an overview of other ways to implement it please
refere to commit 7dcf5c3e45 ("xfs: add RENAME_WHITEOUT support").

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:55:02 +02:00
Richard Weinberger 474b93704f ubifs: Implement O_TMPFILE
This patchs adds O_TMPFILE support to UBIFS.
A temp file is a reference to an unlinked inode, a user
holding the reference can use it. As soon it is being closed
all data vanishes.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:55:02 +02:00
Miklos Szeredi 2773bf00ae fs: rename "rename2" i_op to "rename"
Generated patch:

sed -i "s/\.rename2\t/\.rename\t\t/" `git grep -wl rename2`
sed -i "s/\brename2\b/rename/g" `git grep -wl rename2`

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-27 11:03:58 +02:00
Miklos Szeredi f03b8ad8d3 fs: support RENAME_NOREPLACE for local filesystems
This is trivial to do:

 - add flags argument to foo_rename()
 - check if flags doesn't have any other than RENAME_NOREPLACE
 - assign foo_rename() to .rename2 instead of .rename

Filesystems converted:

affs, bfs, exofs, ext2, hfs, hfsplus, jffs2, jfs, logfs, minix, msdos,
nilfs2, omfs, reiserfs, sysvfs, ubifs, udf, ufs, vfat.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Acked-by: Boaz Harrosh <ooo@electrozaur.com>
Acked-by: Richard Weinberger <richard@nod.at>
Acked-by: Bob Copeland <me@bobcopeland.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Dave Kleikamp <shaggy@kernel.org>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Christoph Hellwig <hch@infradead.org>
2016-09-27 11:03:57 +02:00
Jan Kara 31051c85b5 fs: Give dentry to inode_change_ok() instead of inode
inode_change_ok() will be resposible for clearing capabilities and IMA
extended attributes and as such will need dentry. Give it as an argument
to inode_change_ok() instead of an inode. Also rename inode_change_ok()
to setattr_prepare() to better relect that it does also some
modifications in addition to checks.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2016-09-22 10:56:19 +02:00
Richard Weinberger 17ce1eb0b6 ubifs: Fix xattr generic handler usage
UBIFS uses full names to work with xattrs, therefore we have to use
xattr_full_name() to obtain the xattr prefix as string.

Cc: <stable@vger.kernel.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Fixes: 2b88fc21ca ("ubifs: Switch to generic xattr handlers")
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Tested-by: Dongsheng Yang <dongsheng081251@gmail.com>
2016-08-23 23:02:52 +02:00
Vincent Stehlé c0082e985f ubifs: Fix assertion in layout_in_gaps()
An assertion in layout_in_gaps() verifies that the gap_lebs pointer is
below the maximum bound. When computing this maximum bound the idx_lebs
count is multiplied by sizeof(int), while C pointers arithmetic does take
into account the size of the pointed elements implicitly already. Remove
the multiplication to fix the assertion.

Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Cc: <stable@vger.kernel.org>
Signed-off-by: Vincent Stehlé <vincent.stehle@intel.com>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-08-23 23:02:40 +02:00
Sylvain Etienne 13cd091364 ubifs: switch_gc_head: Remove redondant sync of wbuf
The wbuf is already sync-ed before ubifs_leb_unmap()

Signed-off-by: Sylvain Etienne <seti@dadboo.eu>
Acked-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:32:37 +02:00
Daniel Golle dccbc9197d ubifs: Silence early error messages if MS_SILENT is set
Probe-mounting a volume too small for UBIFS results in kernel log
polution which might irritate users.
Address this by silencing errors which may happen during boot if the
rootfs is e.g. squashfs (and thus rather small) stored on a UBI volume.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:30:36 +02:00
Daniel Golle 380bc8b710 ubifs: Update comment for ubifs_errc
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:30:26 +02:00
Ben Dooks dfaf8d2aec ubifs: Make xattr structures static
Fix sparse warnings from the use of "struct xattr_handler"
structures that are not exported by making them static. Fixes
the following sparse warnings:

/fs/ubifs/xattr.c:595:28: warning: symbol 'ubifs_user_xattr_handler' was not declared. Should it be static?
/fs/ubifs/xattr.c:601:28: warning: symbol 'ubifs_trusted_xattr_handler' was not declared. Should it be static?
/fs/ubifs/xattr.c:607:28: warning: symbol 'ubifs_security_xattr_handler' was not declared. Should it be static?

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 16:19:43 +02:00
Daniel Golle 1ae92642e5 ubifs: Silence error output if MS_SILENT is set
This change completes commit
90bea5a3f0 ("UBIFS: respect MS_SILENT mount flag")
which already implements support for MS_SILENT except for that one
error message which is still being displayed despite MS_SILENT being
set. Suppress that error message as well in case MS_SILENT is set.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
[rw: massaged commit message]
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 16:17:50 +02:00
Kirill A. Shutemov 4ac1c17b20 UBIFS: Implement ->migratepage()
During page migrations UBIFS might get confused
and the following assert triggers:
[  213.480000] UBIFS assert failed in ubifs_set_page_dirty at 1451 (pid 436)
[  213.490000] CPU: 0 PID: 436 Comm: drm-stress-test Not tainted 4.4.4-00176-geaa802524636-dirty #1008
[  213.490000] Hardware name: Allwinner sun4i/sun5i Families
[  213.490000] [<c0015e70>] (unwind_backtrace) from [<c0012cdc>] (show_stack+0x10/0x14)
[  213.490000] [<c0012cdc>] (show_stack) from [<c02ad834>] (dump_stack+0x8c/0xa0)
[  213.490000] [<c02ad834>] (dump_stack) from [<c0236ee8>] (ubifs_set_page_dirty+0x44/0x50)
[  213.490000] [<c0236ee8>] (ubifs_set_page_dirty) from [<c00fa0bc>] (try_to_unmap_one+0x10c/0x3a8)
[  213.490000] [<c00fa0bc>] (try_to_unmap_one) from [<c00fadb4>] (rmap_walk+0xb4/0x290)
[  213.490000] [<c00fadb4>] (rmap_walk) from [<c00fb1bc>] (try_to_unmap+0x64/0x80)
[  213.490000] [<c00fb1bc>] (try_to_unmap) from [<c010dc28>] (migrate_pages+0x328/0x7a0)
[  213.490000] [<c010dc28>] (migrate_pages) from [<c00d0cb0>] (alloc_contig_range+0x168/0x2f4)
[  213.490000] [<c00d0cb0>] (alloc_contig_range) from [<c010ec00>] (cma_alloc+0x170/0x2c0)
[  213.490000] [<c010ec00>] (cma_alloc) from [<c001a958>] (__alloc_from_contiguous+0x38/0xd8)
[  213.490000] [<c001a958>] (__alloc_from_contiguous) from [<c001ad44>] (__dma_alloc+0x23c/0x274)
[  213.490000] [<c001ad44>] (__dma_alloc) from [<c001ae08>] (arm_dma_alloc+0x54/0x5c)
[  213.490000] [<c001ae08>] (arm_dma_alloc) from [<c035cecc>] (drm_gem_cma_create+0xb8/0xf0)
[  213.490000] [<c035cecc>] (drm_gem_cma_create) from [<c035cf20>] (drm_gem_cma_create_with_handle+0x1c/0xe8)
[  213.490000] [<c035cf20>] (drm_gem_cma_create_with_handle) from [<c035d088>] (drm_gem_cma_dumb_create+0x3c/0x48)
[  213.490000] [<c035d088>] (drm_gem_cma_dumb_create) from [<c0341ed8>] (drm_ioctl+0x12c/0x444)
[  213.490000] [<c0341ed8>] (drm_ioctl) from [<c0121adc>] (do_vfs_ioctl+0x3f4/0x614)
[  213.490000] [<c0121adc>] (do_vfs_ioctl) from [<c0121d30>] (SyS_ioctl+0x34/0x5c)
[  213.490000] [<c0121d30>] (SyS_ioctl) from [<c000f2c0>] (ret_fast_syscall+0x0/0x34)

UBIFS is using PagePrivate() which can have different meanings across
filesystems. Therefore the generic page migration code cannot handle this
case correctly.
We have to implement our own migration function which basically does a
plain copy but also duplicates the page private flag.
UBIFS is not a block device filesystem and cannot use buffer_migrate_page().

Cc: stable@vger.kernel.org
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
[rw: Massaged changelog, build fixes, etc...]
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Christoph Hellwig <hch@lst.de>
2016-06-23 00:29:53 +02:00
Linus Torvalds 23a3e178b9 This pull request contains mostly cleanups and minor
improvements of UBI and UBIFS.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJXSJl6AAoJEEtJtSqsAOnWbbAP/2ls5KpGsRSoOP4NsKo5+8k9
 GsZ8hb0iN+UGDxMB5Jlj6O1ISNPz/8o7iGBuWa5OWFdh4Nn38sA1Qv996Rg3Ca3O
 8KYEGAD7POWANfxCTmyQX/L8AsOP62B3diFktPyGrtYmLsVzx/AFN9/nM27ticdF
 PQpUtLwvL/m7/oE6ymFCB4x34+DkTa7Jx4e+chVlF61CUipGc5I60VVX1Do+/DWt
 S37ajjIxMJx0BO7Zs6vrcH9uGFSbvSpVBr9/sC16t0Fa1/vAa9pXieg3y5XZmtOl
 +6NlwUxXVu5IjHkYYGZP8jsrA7bKJlflgCo/w6WKC6V0KECnBx+oSG9uMjDVPBSM
 rc4VF4nevnNLNqFGBRWhxiYrRUCdB1TcWVDOzM16peNeUNJiWj/+4PbsiJQBwECH
 ZnE/dBrjU7v+J8UmHsUJSveW6/5PluJf1loDzrip7gk5QD13wdpPo3VOycOj1vMD
 iS6H/TBjtcEzWWh7qcBbtgkbHQH0g+w9YSzVy7ODFweuoq+4qCZ1davqzzZaRv3q
 hylor1gea3b4GnH1EDjmXyQOOuIf2rLdJhou8WoZhmy2q3FrnE+89/CCOCMkeQMQ
 qQr1lAbxD7A+8bYwSZ7Nvv6lrBUrat59DV6FHFc69MdkGT7J6z85ZOyd3lxRvJc/
 XiZZkKhrYSKgV/iLFhHW
 =L16y
 -----END PGP SIGNATURE-----

Merge tag 'upstream-4.7-rc1' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS updates from Richard Weinberger:
 "This contains mostly cleanups and minor improvements of UBI and UBIFS"

* tag 'upstream-4.7-rc1' of git://git.infradead.org/linux-ubifs:
  ubifs: ubifs_dump_inode: Fix dumping field bulk_read
  UBI: Fix static volume checks when Fastmap is used
  UBI: Set free_count to zero before walking through erase list
  UBI: Silence an unintialized variable warning
  UBI: Clean up return in ubi_remove_volume()
  UBI: Modify wrong comment in ubi_leb_map function.
  UBI: Don't read back all data in ubi_eba_copy_leb()
  UBI: Add ro-mode sysfs attribute
2016-05-27 18:49:29 -07:00
Al Viro 5930122683 switch xattr_handler->set() to passing dentry and inode separately
preparation for similar switch in ->setxattr() (see the next commit for
rationale).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-27 15:39:43 -04:00
Andreas Gruenbacher 1112018cef ubifs: ubifs_dump_inode: Fix dumping field bulk_read
The wrong field (xattr) is dumped here due to a copy-and-paste error.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-05-24 15:29:44 +02:00
Andy Shevchenko 8da4b8c48e lib/uuid.c: move generate_random_uuid() to uuid.c
Let's gather the UUID related functions under one hood.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Linus Torvalds ba5a2655c2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull remaining vfs xattr work from Al Viro:
 "The rest of work.xattr (non-cifs conversions)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  btrfs: Switch to generic xattr handlers
  ubifs: Switch to generic xattr handlers
  jfs: Switch to generic xattr handlers
  jfs: Clean up xattr name mapping
  gfs2: Switch to generic xattr handlers
  ceph: kill __ceph_removexattr()
  ceph: Switch to generic xattr handlers
  ceph: Get rid of d_find_alias in ceph_set_acl
2016-05-18 10:08:45 -07:00
Andreas Gruenbacher 2b88fc21ca ubifs: Switch to generic xattr handlers
Ubifs internally uses special inodes for storing xattrs. Those inodes
had NULL {get,set,remove}xattr inode operations before this change, so
xattr operations on them would fail. The super block's s_xattr field
would also apply to those special inodes. However, the inodes are not
visible outside of ubifs, and so no xattr operations will ever be
carried out on them anyway.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-17 19:16:23 -04:00
Al Viro c51da20c48 more trivial ->iterate_shared conversions
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-09 11:41:14 -04:00
Al Viro 84695ffee7 Merge getxattr prototype change into work.lookups
The rest of work.xattr stuff isn't needed for this branch
2016-05-02 19:45:47 -04:00
Al Viro ce23e64013 ->getxattr(): pass dentry and inode as separate arguments
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-11 00:48:00 -04:00
Kirill A. Shutemov ea1754a084 mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage
Mostly direct substitution with occasional adjustment or removing
outdated comments.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-04 10:41:08 -07:00
Kirill A. Shutemov 09cbfeaf1a mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.

This promise never materialized.  And unlikely will.

We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE.  And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.

Let's stop pretending that pages in page cache are special.  They are
not.

The changes are pretty straight-forward:

 - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

 - page_cache_get() -> get_page();

 - page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using
script below.  For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.

The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.

There are few places in the code where coccinelle didn't reach.  I'll
fix them manually in a separate patch.  Comments and documentation also
will be addressed with the separate patch.

virtual patch

@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT

@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE

@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK

@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)

@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)

@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-04 10:41:08 -07:00
Andreas Gruenbacher c27cb97218 ubifs: Remove unused header
UBIFS does not support POSIX ACLs, so there is no need for including any
POSIX ACL hesders.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-03-20 21:37:46 +01:00
Joe Perches 3e7f2c5104 ubifs: Add logging functions for ubifs_msg, ubifs_err and ubifs_warn
The existing logging macros are fairly large and converting the
macros to functions make the object code smaller.

Use %pV and __builtin_return_address(0) as appropriate.

$ size fs/ubifs/built-in.o*
   text	   data	    bss	    dec	    hex	filename
 575831	 309688	 161312	1046831	  ff92f	fs/ubifs/built-in.o.allyesconfig.new
 622457	 312872	 161120	1096449	 10bb01	fs/ubifs/built-in.o.allyesconfig.old
 223785	    640	    644	 225069	  36f2d	fs/ubifs/built-in.o.defconfig.new
 251873	    640	    644	 253157	  3dce5	fs/ubifs/built-in.o.defconfig.old

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-03-20 21:36:05 +01:00
Al Viro 5955102c99 wrappers for ->i_mutex access
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-22 18:04:28 -05:00
Vladimir Davydov 5d097056c9 kmemcg: account certain kmem allocations to memcg
Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg.  For the list, see below:

 - threadinfo
 - task_struct
 - task_delay_info
 - pid
 - cred
 - mm_struct
 - vm_area_struct and vm_region (nommu)
 - anon_vma and anon_vma_chain
 - signal_struct
 - sighand_struct
 - fs_struct
 - files_struct
 - fdtable and fdtable->full_fds_bits
 - dentry and external_name
 - inode for all filesystems. This is the most tedious part, because
   most filesystems overwrite the alloc_inode method.

The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds.  Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 16:00:49 -08:00
Linus Torvalds 60b7eca1dc This pull request contains three changes, two cleanups and one UBI
wear leveling improvement by Sebastian Siewior.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJWlWXmAAoJEEtJtSqsAOnW93QQAK8Dn68PxIn6qQFeah0Lg2bI
 a9sycpSHPvQniv7npgcFEh+9iDUPcvMMHVshc9qBPjENzW1VozL/5cZd8Utt5Ua/
 xU7kYRS1uXQLeL8HnASCrsVS4GrcHFJBx4+72XBom15/DsKNBuSa/Z5BkXL1fkOK
 VqBcHQK5FhJLe4I+Crwu7CbM5kc6CbnuyDNDkKrTLLKZZhNI/GormCi3+TmO2hK8
 gOt2z1DyvxyyFM7109WkSERW/9oEG4JDCy34/aAS6HY8YmBMFh9+cb8VU7IRmZD1
 YLS+u0ltwheoNuLs6/9zOEKqwUjY2wgsJk3dUkT05OQ4MNN/5AFSSKkU/kp7XAgp
 5yzrAuA+FGmPiEcfX8HrHYDN9aLctRIOibirYsy6eexwCiSfcWGfJf9FoZj522cG
 WgL7y2eS8ODOE2PzLSD2d756G0ka+tuFeQ9BhC3cfbxygy/cTcXaFPNRp8sJ8Biv
 hXQQss62L5Pn/vtlOxGLQAhWDHB4TG34NQ+1/Ni+4LB6pTHOfoD4yomPCxJqFXGE
 Pcz2rHb+mErWNesdI8J7MlXUbR5gRIE+lpZZvhzvRSlKw+CRT7OYDat1p7WXE4Qm
 FtuX+pSV86N3Mx3vTF6QyucrRANKMNr8M9VhPH8nZACZlfooIk3ax9eSjNEzJ1Be
 cN3vsR2y6d9U9q2wk+OX
 =Zouw
 -----END PGP SIGNATURE-----

Merge tag 'upstream-4.5-rc1' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS updates from Richard Weinberger:
 "This contains three changes - two cleanups and one UBI wear leveling
  improvement by Sebastian Siewior"

* tag 'upstream-4.5-rc1' of git://git.infradead.org/linux-ubifs:
  ubifs: Use XATTR_*_PREFIX_LEN
  UBIFS: add a comment in key.h for unused parameter
  mtd: ubi: wl: avoid erasing a PEB which is empty
2016-01-12 18:20:20 -08:00
Richard Weinberger 4fdd1d51ad ubifs: Use XATTR_*_PREFIX_LEN
...instead of open coding it.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10 12:33:47 +01:00
Dongsheng Yang 170eb55f7d UBIFS: add a comment in key.h for unused parameter
Add a comment in key.h to explain why we keep an unused
parameter in key helpers.

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10 12:33:30 +01:00
Al Viro 6b2553918d replace ->follow_link() with new method that could stay in RCU mode
new method: ->get_link(); replacement of ->follow_link().  The differences
are:
	* inode and dentry are passed separately
	* might be called both in RCU and non-RCU mode;
the former is indicated by passing it a NULL dentry.
	* when called that way it isn't allowed to block
and should return ERR_PTR(-ECHILD) if it needs to be called
in non-RCU mode.

It's a flagday change - the old method is gone, all in-tree instances
converted.  Conversion isn't hard; said that, so far very few instances
do not immediately bail out when called in RCU mode.  That'll change
in the next commits.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-08 22:41:54 -05:00
Linus Torvalds 5d2eb548b3 Merge branch 'for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs xattr cleanups from Al Viro.

* 'for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  f2fs: xattr simplifications
  squashfs: xattr simplifications
  9p: xattr simplifications
  xattr handlers: Pass handler to operations instead of flags
  jffs2: Add missing capability check for listing trusted xattrs
  hfsplus: Remove unused xattr handler list operations
  ubifs: Remove unused security xattr handler
  vfs: Fix the posix_acl_xattr_list return value
  vfs: Check attribute names in posix acl xattr handers
2015-11-13 18:02:30 -08:00
Andreas Gruenbacher 13d3408f10 ubifs: Remove unused security xattr handler
Ubifs installs a security xattr handler in sb->s_xattr but doesn't use the
generic_{get,set,list,remove}xattr inode operations needed for processing
this list of attribute handlers; the handler is never called.  Instead,
ubifs uses its own xattr handlers which also process security xattrs.

Remove the dead code.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: linux-mtd@lists.infradead.org
Cc: Subodh Nijsure <snijsure@grid-net.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-13 20:34:29 -05:00
Linus Torvalds 01504f5e9e This pull request includes the following UBI/UBIFS changes:
* access time support for UBIFS by Dongsheng Yang
 * random cleanups and bug fixes all over the place
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJWQmnVAAoJEEtJtSqsAOnWpnkP/0pUN+Cx58QJoJXxaSgClH/l
 QwYGznz4Fv3QuSafXJZy8CPzsTasaOOtoVKJPfPTvP8Drrl3FFbtdhqlqdx1YKqw
 DuKeViwmE/HF49a5pxgtR8032l35s3zZUWwd2yKA3B6SVEIsjAoWlIutyzKA/DiR
 R4mGVsfRevfzq1+l2YXGGZ8PJXMtRKnFTjzmIUJY0OeT3xCvl/uUFCid2x693aut
 GpJ6AMKGYcU+BuQJmb89wvdRYCHPdYEJr7TWq+GBxU3gDPlLY9302IhPq9ftv8Lp
 ClA0X3GZqVPR921XR+aA4ysRbfENT+UBewiBt2OEwQLnI6jxEn9EqsgErATxpOYy
 PEduYuU56mqc67EE4r7+YHfkidMSa8KrR9s+GpzR6JStW9HNGFDfGfCAK7BcdvWT
 +TuHfNk787Gbl4TO41iFI87s6zNQszYOpxBTEFYHx0FQWLNxD5c79UkisL+biqvJ
 bX8+7pClJQXcY0CXqe56GVQLC5v4HbvMmnstiNk52ihxEB0NT6htzWe/GPb/+UZ1
 K+kOAT54DPeag3FK4J1WZ57cYyJHdbyASB5vKvb+A1IwzrYFo8P35GekoWNK7eAJ
 Od5mb++7pTEFSH7oSwsprPOmrHLRmXgRG8yhLRfqX1ZwFeyFdqGQxjce9O/QB1jo
 jr4dxAY6Gnjdo1q07QSI
 =9n2S
 -----END PGP SIGNATURE-----

Merge tag 'upstream-4.4-rc1' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS updates from Richard Weinberger:

 - access time support for UBIFS by Dongsheng Yang

 - random cleanups and bug fixes all over the place

* tag 'upstream-4.4-rc1' of git://git.infradead.org/linux-ubifs:
  ubifs: introduce UBIFS_ATIME_SUPPORT to ubifs
  ubifs: make ubifs_[get|set]xattr atomic
  UBIFS: Delete unnecessary checks before the function call "iput"
  UBI: Remove in vain semicolon
  UBI: Fastmap: Fix PEB array type
  UBIFS: Fix possible memory leak in ubifs_readdir()
  fs/ubifs: remove unnecessary new_valid_dev check
  ubi: fastmap: Implement produce_free_peb()
  UBIFS: print verbose message when rescanning a corrupted node
  UBIFS: call dbg_is_power_cut() instead of reading c->dbg->pc_happened
  UBI: drop null test before destroy functions
  UBI: Update comments to reflect UBI_METAONLY flag
  UBI: Fix debug message
  UBI: Fix typo in comment
  UBI: Fastmap: Simplify expression
  UBIFS: fix a typo in comment of ubifs_budget_req
  UBIFS: use kmemdup rather than duplicating its implementation
2015-11-10 16:35:06 -08:00
Dongsheng Yang 8c1c5f2638 ubifs: introduce UBIFS_ATIME_SUPPORT to ubifs
To make ubifs support atime flexily, this commit introduces
a Kconfig option named as UBIFS_ATIME_SUPPORT.

With UBIFS_ATIME_SUPPORT=n:
	ubifs keeps the full compatibility to no_atime from
the start of ubifs.

=================UBIFS_ATIME_SUPPORT=n=======================
-o - no atime
-o atime - no atime
-o noatime - no atime
-o relatime - no atime
-o strictatime - no atime
-o lazyatime - no atime

With UBIFS_ATIME_SUPPORT=y:
	ubifs supports the atime same with other main stream
file systems.
=================UBIFS_ATIME_SUPPORT=y=======================
-o - default behavior (relatime currently)
-o atime - atime support
-o noatime - no atime support
-o relatime - relative atime support
-o strictatime - strict atime support
-o lazyatime - lazy atime support

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-07 11:35:08 +01:00
Dongsheng Yang ab92a20bce ubifs: make ubifs_[get|set]xattr atomic
This commit make the ubifs_[get|set]xattr protected by ui_mutex.

Originally, there is a possibility that ubifs_getxattr to get
a wrong value.

  P1                                  P2
----------                  	----------
ubifs_getxattr                      ubifs_setxattr
					- kfree()
- memcpy()
					- kmemdup()

Then ubifs_getxattr() would get a non-sense data. To solve this
problem, this commit make the xattr of ubifs_inode updated in
atomic.

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-07 11:33:17 +01:00
Markus Elfring 54bcfdf19e UBIFS: Delete unnecessary checks before the function call "iput"
The iput() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-06 23:26:52 +01:00
Richard Weinberger aeeb14f763 UBIFS: Fix possible memory leak in ubifs_readdir()
If ubifs_tnc_next_ent() returns something else than -ENOENT
we leak file->private_data.

Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: David Gstir <david@sigma-star.at>
2015-11-06 23:26:49 +01:00
Yaowei Bai 86ba9ed928 fs/ubifs: remove unnecessary new_valid_dev check
As currently new_valid_dev always returns 1, so new_valid_dev check is not
needed, remove it.

Signed-off-by: Yaowei Bai <bywxiaobai@163.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-06 23:26:48 +01:00
shengyong be186cc4de UBIFS: print verbose message when rescanning a corrupted node
This is a trivial fix of showing verbose message when leb-recovery detects
a corrupted node, which is not the last one in the LEB. Rescan expects to
show more detail of the corrupted node.

Reviewed-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-10-03 20:42:01 +02:00
shengyong 8f6983abe3 UBIFS: call dbg_is_power_cut() instead of reading c->dbg->pc_happened
Call dbg_is_power_cut() to emulate power cut instead of reading
c->dbg->pc_happened. Otherwise, the function becomes dead code.

Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-10-03 20:40:21 +02:00
Dongsheng Yang 7d25b361a2 UBIFS: fix a typo in comment of ubifs_budget_req
s/now/how

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-10-03 20:03:25 +02:00
Andrzej Hajda bbc8a0044f UBIFS: use kmemdup rather than duplicating its implementation
The patch was generated using fixed coccinelle semantic patch
scripts/coccinelle/api/memdup.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2014320

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-10-03 20:03:14 +02:00
Richard Weinberger cf6f54e3f1 UBIFS: Kill unneeded locking in ubifs_init_security
Fixes the following lockdep splat:
[    1.244527] =============================================
[    1.245193] [ INFO: possible recursive locking detected ]
[    1.245193] 4.2.0-rc1+ #37 Not tainted
[    1.245193] ---------------------------------------------
[    1.245193] cp/742 is trying to acquire lock:
[    1.245193]  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff812b3f69>] ubifs_init_security+0x29/0xb0
[    1.245193]
[    1.245193] but task is already holding lock:
[    1.245193]  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff81198e7f>] path_openat+0x3af/0x1280
[    1.245193]
[    1.245193] other info that might help us debug this:
[    1.245193]  Possible unsafe locking scenario:
[    1.245193]
[    1.245193]        CPU0
[    1.245193]        ----
[    1.245193]   lock(&sb->s_type->i_mutex_key#9);
[    1.245193]   lock(&sb->s_type->i_mutex_key#9);
[    1.245193]
[    1.245193]  *** DEADLOCK ***
[    1.245193]
[    1.245193]  May be due to missing lock nesting notation
[    1.245193]
[    1.245193] 2 locks held by cp/742:
[    1.245193]  #0:  (sb_writers#5){.+.+.+}, at: [<ffffffff811ad37f>] mnt_want_write+0x1f/0x50
[    1.245193]  #1:  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff81198e7f>] path_openat+0x3af/0x1280
[    1.245193]
[    1.245193] stack backtrace:
[    1.245193] CPU: 2 PID: 742 Comm: cp Not tainted 4.2.0-rc1+ #37
[    1.245193] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140816_022509-build35 04/01/2014
[    1.245193]  ffffffff8252d530 ffff88007b023a38 ffffffff814f6f49 ffffffff810b56c5
[    1.245193]  ffff88007c30cc80 ffff88007b023af8 ffffffff810a150d ffff88007b023a68
[    1.245193]  000000008101302a ffff880000000000 00000008f447e23f ffffffff8252d500
[    1.245193] Call Trace:
[    1.245193]  [<ffffffff814f6f49>] dump_stack+0x4c/0x65
[    1.245193]  [<ffffffff810b56c5>] ? console_unlock+0x1c5/0x510
[    1.245193]  [<ffffffff810a150d>] __lock_acquire+0x1a6d/0x1ea0
[    1.245193]  [<ffffffff8109fa78>] ? __lock_is_held+0x58/0x80
[    1.245193]  [<ffffffff810a1a93>] lock_acquire+0xd3/0x270
[    1.245193]  [<ffffffff812b3f69>] ? ubifs_init_security+0x29/0xb0
[    1.245193]  [<ffffffff814fc83b>] mutex_lock_nested+0x6b/0x3a0
[    1.245193]  [<ffffffff812b3f69>] ? ubifs_init_security+0x29/0xb0
[    1.245193]  [<ffffffff812b3f69>] ? ubifs_init_security+0x29/0xb0
[    1.245193]  [<ffffffff812b3f69>] ubifs_init_security+0x29/0xb0
[    1.245193]  [<ffffffff8128e286>] ubifs_create+0xa6/0x1f0
[    1.245193]  [<ffffffff81198e7f>] ? path_openat+0x3af/0x1280
[    1.245193]  [<ffffffff81195d15>] vfs_create+0x95/0xc0
[    1.245193]  [<ffffffff8119929c>] path_openat+0x7cc/0x1280
[    1.245193]  [<ffffffff8109ffe3>] ? __lock_acquire+0x543/0x1ea0
[    1.245193]  [<ffffffff81088f20>] ? sched_clock_cpu+0x90/0xc0
[    1.245193]  [<ffffffff81088c00>] ? calc_global_load_tick+0x60/0x90
[    1.245193]  [<ffffffff81088f20>] ? sched_clock_cpu+0x90/0xc0
[    1.245193]  [<ffffffff811a9cef>] ? __alloc_fd+0xaf/0x180
[    1.245193]  [<ffffffff8119ac55>] do_filp_open+0x75/0xd0
[    1.245193]  [<ffffffff814ffd86>] ? _raw_spin_unlock+0x26/0x40
[    1.245193]  [<ffffffff811a9cef>] ? __alloc_fd+0xaf/0x180
[    1.245193]  [<ffffffff81189bd9>] do_sys_open+0x129/0x200
[    1.245193]  [<ffffffff81189cc9>] SyS_open+0x19/0x20
[    1.245193]  [<ffffffff81500717>] entry_SYSCALL_64_fastpath+0x12/0x6f

While the lockdep splat is a false positive, becuase path_openat holds i_mutex
of the parent directory and ubifs_init_security() tries to acquire i_mutex
of a new inode, it reveals that taking i_mutex in ubifs_init_security() is
in vain because it is only being called in the inode allocation path
and therefore nobody else can see the inode yet.

Cc: stable@vger.kernel.org # 3.20-
Reported-and-tested-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-and-tested-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: dedekind1@gmail.com
2015-09-29 12:45:42 +02:00
Linus Torvalds cc8a0a9439 This pull request includes the following UBI/UBIFS changes:
* Minor fixes for UBI and UBIFS
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJVi9DTAAoJEPmeIjfg4UaaOlAP/AtZqcxpCHlKpbZUz7z1MmCb
 MP9AbowP0CSXrMYSA7GrBROw+bfTM7aZrhWqhhoJxAyBbb9LGldjW9xZmUqUvS5e
 LZjAO8sWqxEbfVBqnpTgA2AiSN37I3+epYqgDFNlNCtxPYxkEXpXVoLUSdrAUKw8
 24EjPe0tq1JKLmjh3XJQZJnmCW+7xPKRUj7KiwPZzombIoNXFA9z4PCOLLlnDx1X
 TRasArrncarslQaHQerRoZUKC/hoIGZMipv94pNovgO8HAcZ/N19ef1Fc4iG5zvS
 XUMerUyKMTVfR6LFjueMjRpEMOmDb4AifWeDKY00xqY4RCnkIcAAqVRBGWQep6Y4
 PxCwttNktykWj/LygPCsQk6gd0IxdVGsFra4DpxhCCWXaUw2RJCRg0a4mQFOqojO
 tmCVQVYgd4Ob2DDnxmS7Z8IvjIWzsSkgatF+Dde2feeHUwEuM9NoudHg704+A30x
 3fNwIFhuREhOymxGpBaprfaKh7khF88CcrGjiEOvw/DK0QfwlVoUVLqjkNVaLZXC
 nrwNATdheq2gMV2Mdydy/uo5NpUxeblSrlnuknKQSTmH5m2/U9F9lJ/v3Sv7ShjB
 gWqyNIoxJBKTSbsGkFP5bzWmxwsQ4HlPTcAmJhoUDK7SLeXThhgRwB4o2zlriYfS
 D7ZVxRMY5JCNxomIzD0X
 =xU0H
 -----END PGP SIGNATURE-----

Merge tag 'upstream-4.2-rc1' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS updates from Richard Weinberger:
 "Minor fixes for UBI and UBIFS"

* tag 'upstream-4.2-rc1' of git://git.infradead.org/linux-ubifs:
  UBI: Remove unnecessary `\'
  UBI: Use static class and attribute groups
  UBI: add a helper function for updatting on-flash layout volumes
  UBI: Fastmap: Do not add vol if it already exists
  UBI: Init vol->reserved_pebs by assignment
  UBI: Fastmap: Rename variables to make them meaningful
  UBI: Fastmap: Remove unnecessary `\'
  UBI: Fastmap: Use max() to get the larger value
  ubifs: fix to check error code of register_shrinker
  UBI: block: Dynamically allocate minor numbers
2015-06-25 14:11:34 -07:00
Chao Yu a1fe33af5f ubifs: fix to check error code of register_shrinker
register_shrinker() in ubifs_init() can fail due to fail to call kzalloc.
This patch fixes to check the return value of register_shrinker, otherwise
our shrinker may be unregistered after ubifs initialized successfully.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-06-02 11:41:38 +02:00
Al Viro 0f301bd305 ubifs: switch to simple_follow_link()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-10 22:18:25 -04:00
Linus Torvalds 9ec3a646fe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull fourth vfs update from Al Viro:
 "d_inode() annotations from David Howells (sat in for-next since before
  the beginning of merge window) + four assorted fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  RCU pathwalk breakage when running into a symlink overmounting something
  fix I_DIO_WAKEUP definition
  direct-io: only inc/dec inode->i_dio_count for file systems
  fs/9p: fix readdir()
  VFS: assorted d_backing_inode() annotations
  VFS: fs/inode.c helpers: d_inode() annotations
  VFS: fs/cachefiles: d_backing_inode() annotations
  VFS: fs library helpers: d_inode() annotations
  VFS: assorted weird filesystems: d_inode() annotations
  VFS: normal filesystems (and lustre): d_inode() annotations
  VFS: security/: d_inode() annotations
  VFS: security/: d_backing_inode() annotations
  VFS: net/: d_inode() annotations
  VFS: net/unix: d_backing_inode() annotations
  VFS: kernel/: d_inode() annotations
  VFS: audit: d_backing_inode() annotations
  VFS: Fix up some ->d_inode accesses in the chelsio driver
  VFS: Cachefiles should perform fs modifications on the top layer only
  VFS: AF_UNIX sockets should call mknod on the top layer only
2015-04-26 17:22:07 -07:00
Linus Torvalds d613896926 This pull request includes the following UBI/UBIFS changes:
* Powercut emulation for UBI
 * A huge update to UBI Fastmap
 * Cleanups and bugfixes all over UBI and UBIFS
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJVLsBfAAoJEEtJtSqsAOnWyaYP/30jnSEu4y2agDPwdev0wnqF
 QRAAVWxLCzinBkElXJ0POvhdQwnxP37ACqa/mNVuTa5++N1eI8PPpcRu+EUrVmoP
 mSRmEBdarvNYRLmOaoBBwY4qFvmcZx9Odt2BIdXz9xV+Mk220VdRW2PBn8JfGDn4
 JJziWKmq9HUKXMQY9SVkOfT5tW+xHdjRUuNwp2S5nuV4hz2rUv0ChlYqCBqvca8/
 tfjtlCLBGGuFSXfa9wtdCTWlU2Hyh4sG9u72c23iGqHmwOc9ZIpBlgiz+nYZFXoo
 GriBDBXG5ib1p+Px+oecWPBLv/Ipgc887IJHFAz3exPmSkQQTnuCSYFsraS1zqwZ
 91kOGjcF5wY9EwBbT0a1VDdOwYT9MC5gD7G71MarDt0BWEYDvPufd6XPc5qPfP/9
 xtscNQZ2FNd4gWGRT5irBOqIERdsWt2cjgD0aK2gbV1p9meUBU6hlvMIOU1YmBKg
 TwK8o+RwhE13JWwSExqcW+FRz9Aup/W/EDRNuuUGOXFFSnSvqSwE/y3JeP7jydIf
 7J3rvrKdRWObTO/uZvtx+mwU3FgNPoITCgQQtAz7jexjksLWMxVi1gF1VYULScW/
 p5ewCFsuRQSeCL/M1cKO1trYARbe9YITewbrQKuL8wknQKGcXY0XiV0OpPlH0xXD
 LqnvNoeixq4AH7rkMQIk
 =nzVw
 -----END PGP SIGNATURE-----

Merge tag 'upstream-4.1-rc1' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS updates from Richard Weinberger:
 "This pull request includes the following UBI/UBIFS changes:

   - powercut emulation for UBI
   - a huge update to UBI Fastmap
   - cleanups and bugfixes all over UBI and UBIFS"

* tag 'upstream-4.1-rc1' of git://git.infradead.org/linux-ubifs: (50 commits)
  UBI: power cut emulation for testing
  UBIFS: fix output format of INUM_WATERMARK
  UBI: Fastmap: Fall back to scanning mode after ECC error
  UBI: Fastmap: Remove is_fm_block()
  UBI: Fastmap: Add blank line after declarations
  UBI: Fastmap: Remove else after return.
  UBI: Fastmap: Introduce may_reserve_for_fm()
  UBI: Fastmap: Introduce ubi_fastmap_init()
  UBI: Fastmap: Wire up WL accessor functions
  UBI: Add accessor functions for WL data structures
  UBI: Move fastmap specific functions out of wl.c
  UBI: Fastmap: Add new module parameter fm_debug
  UBI: Fastmap: Make self_check_eba() depend on fastmap self checking
  UBI: Fastmap: Add self check to detect absent PEBs
  UBI: Fix stale pointers in ubi->lookuptbl
  UBI: Fastmap: Enhance fastmap checking
  UBI: Add initial support for fastmap self checks
  UBI: Fastmap: Rework fastmap error paths
  UBI: Fastmap: Prepare for variable sized fastmaps
  UBI: Fastmap: Locking updates
  ...
2015-04-15 13:43:40 -07:00
David Howells 2b0143b5c9 VFS: normal filesystems (and lustre): d_inode() annotations
that's the bulk of filesystem drivers dealing with inodes of their own

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-15 15:06:57 -04:00
Sheng Yong 1a7e985dd1 UBIFS: fix output format of INUM_WATERMARK
The INUM_WATERMARK is a unsigned 32bit value, `%d' prints it as negatave:
[  103.682255] UBIFS warning (ubi0:0 pid 691): ubifs_new_inode: running out of inode numbers (current 122763, max -256)

Fix it as:
[  154.422940] UBIFS warning (ubi0:0 pid 688): ubifs_new_inode: running out of inode numbers (current 122765, max 4294967040)

Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-04-13 13:37:31 +03:00
Al Viro 5d5d568975 make new_sync_{read,write}() static
All places outside of core VFS that checked ->read and ->write for being NULL or
called the methods directly are gone now, so NULL {read,write} with non-NULL
{read,write}_iter will do the right thing in all cases.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:29:40 -04:00
Christoph Hellwig e2e40f2c1e fs: move struct kiocb to fs.h
struct kiocb now is a generic I/O container, so move it to fs.h.
Also do a #include diet for aio.h while we're at it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-03-25 20:28:11 -04:00
Sheng Yong 235c362bd0 UBIFS: extend debug/message capabilities
In the case where we have more than one volumes on different UBI
devices, it may be not that easy to tell which volume prints the
messages.  Add ubi number and volume id in ubifs_msg/warn/error
to help debug. These two values are passed by struct ubifs_info.

For those where ubifs_info is not initialized yet, ubifs_* is
replaced by pr_*. For those where ubifs_info is not avaliable,
ubifs_info is passed to the calling function as a const parameter.

The output looks like,

[   95.444879] UBIFS (ubi0:1): background thread "ubifs_bgt0_1" started, PID 696
[   95.484688] UBIFS (ubi0:1): UBIFS: mounted UBI device 0, volume 1, name "test1"
[   95.484694] UBIFS (ubi0:1): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
[   95.484699] UBIFS (ubi0:1): FS size: 30220288 bytes (28 MiB, 238 LEBs), journal size 1523712 bytes (1 MiB, 12 LEBs)
[   95.484703] UBIFS (ubi0:1): reserved for root: 1427378 bytes (1393 KiB)
[   95.484709] UBIFS (ubi0:1): media format: w4/r0 (latest is w4/r0), UUID 40DFFC0E-70BE-4193-8905-F7D6DFE60B17, small LPT model
[   95.489875] UBIFS (ubi1:0): background thread "ubifs_bgt1_0" started, PID 699
[   95.529713] UBIFS (ubi1:0): UBIFS: mounted UBI device 1, volume 0, name "test2"
[   95.529718] UBIFS (ubi1:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
[   95.529724] UBIFS (ubi1:0): FS size: 19808256 bytes (18 MiB, 156 LEBs), journal size 1015809 bytes (0 MiB, 8 LEBs)
[   95.529727] UBIFS (ubi1:0): reserved for root: 935592 bytes (913 KiB)
[   95.529733] UBIFS (ubi1:0): media format: w4/r0 (latest is w4/r0), UUID EEB7779D-F419-4CA9-811B-831CAC7233D4, small LPT model

[  954.264767] UBIFS error (ubi1:0 pid 756): ubifs_read_node: bad node type (255 but expected 6)
[  954.367030] UBIFS error (ubi1:0 pid 756): ubifs_read_node: bad node at LEB 0:0, LEB mapping status 1

Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-03-25 11:08:41 +02:00
Fabian Frederick 8a87dc55f7 UBIFS: simplify returns
Directly return recover_head() and ubifs_leb_unmap()
instead of storing value in err and testing it.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-03-25 11:08:41 +02:00
Yannick Guerrini d3f9db00d0 UBIFS: Fix trivial typos in comments
Change 'comress' to 'compress'
Change 'inteval' to 'interval'
Change 'disabe' to 'disable'
Change 'nenver' to 'never'

Signed-off-by: Yannick Guerrini <yguerrini@tomshardware.fr>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-03-25 11:08:41 +02:00
Sheng Yong 2c84599ca4 UBIFS: do not write master node if need recovery
The commits 781c571 ("UBIFS: intialize LPT earlier") and 0980119 ("UBIFS:
fix-up free space earlier") move some initialization before marking the
master node dirty. But the modification changes the conditions of writing
master.

If unclean umount happens, ubifs may fail when mounting. But trying to
mount it will write new master nodes on the flash. This is useless but
increasing sqnum. So check need_recovery before writing master node, and
don't create new master node if filesystem needs recovery.

The behavour of the bug shows at:
http://lists.infradead.org/pipermail/linux-mtd/2015-February/057712.html

Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Reviewed-by: Ben Gardiner <ben.l.gardiner@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-03-25 11:08:40 +02:00
Taesoo Kim 9401a795c6 UBIFS: fix incorrect unlocking handling
When ubifs_init_security() fails, 'ui_mutex' is incorrectly
unlocked and incorrectly restores 'i_size'. Fix this.

Signed-off-by: Taesoo Kim <tsgatesv@gmail.com>
Fixes: d7f0b70d30 ("UBIFS: Add security.* XATTR support for the UBIFS")
Reviewed-by: Ben Shelton <ben.shelton@ni.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-03-25 11:08:40 +02:00
Linus Torvalds 8c988ae787 Merge branch 'for-linus-v3.20' of git://git.infradead.org/linux-ubifs
Pull UBI and UBIFS updates from Richard Weinberger:
 - cleanups and bug fixes all over UBI and UBIFS
 - block-mq support for UBI Block
 - UBI volumes can now be renamed while they are in use
 - security.* XATTR support for UBIFS
 - a maintainer update

* 'for-linus-v3.20' of git://git.infradead.org/linux-ubifs:
  UBI: block: Fix checking for NULL instead of IS_ERR()
  UBI: block: Continue creating ubiblocks after an initialization error
  UBIFS: return -EINVAL if log head is empty
  UBI: Block: Explain usage of blk_rq_map_sg()
  UBI: fix soft lockup in ubi_check_volume()
  UBI: Fastmap: Care about the protection queue
  UBIFS: add a couple of extra asserts
  UBI: do propagate positive error codes up
  UBI: clean-up printing helpers
  UBI: extend UBI layer debug/messaging capabilities - cosmetics
  UBIFS: add ubifs_err() to print error reason
  UBIFS: Add security.* XATTR support for the UBIFS
  UBIFS: Add xattr support for symlinks
  UBI: Block: Add blk-mq support
  UBI: Add initial support for scatter gather
  UBI: rename_volumes: Use UBI_METAONLY
  UBI: Implement UBI_METAONLY
  Add myself as UBI co-maintainer
2015-02-15 10:11:39 -08:00
Linus Torvalds 6bec003528 Merge branch 'for-3.20/bdi' of git://git.kernel.dk/linux-block
Pull backing device changes from Jens Axboe:
 "This contains a cleanup of how the backing device is handled, in
  preparation for a rework of the life time rules.  In this part, the
  most important change is to split the unrelated nommu mmap flags from
  it, but also removing a backing_dev_info pointer from the
  address_space (and inode), and a cleanup of other various minor bits.

  Christoph did all the work here, I just fixed an oops with pages that
  have a swap backing.  Arnd fixed a missing export, and Oleg killed the
  lustre backing_dev_info from staging.  Last patch was from Al,
  unexporting parts that are now no longer needed outside"

* 'for-3.20/bdi' of git://git.kernel.dk/linux-block:
  Make super_blocks and sb_lock static
  mtd: export new mtd_mmap_capabilities
  fs: make inode_to_bdi() handle NULL inode
  staging/lustre/llite: get rid of backing_dev_info
  fs: remove default_backing_dev_info
  fs: don't reassign dirty inodes to default_backing_dev_info
  nfs: don't call bdi_unregister
  ceph: remove call to bdi_unregister
  fs: remove mapping->backing_dev_info
  fs: export inode_to_bdi and use it in favor of mapping->backing_dev_info
  nilfs2: set up s_bdi like the generic mount_bdev code
  block_dev: get bdev inode bdi directly from the block device
  block_dev: only write bdev inode on close
  fs: introduce f_op->mmap_capabilities for nommu mmap support
  fs: kill BDI_CAP_SWAP_BACKED
  fs: deduplicate noop_backing_dev_info
2015-02-12 13:50:21 -08:00
Kirill A. Shutemov d83a08db5b mm: drop vm_ops->remap_pages and generic_file_remap_pages() stub
Nobody uses it anymore.

[akpm@linux-foundation.org: fix filemap_xip.c]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:30 -08:00
hujianyang 88cff0f0fb UBIFS: return -EINVAL if log head is empty
CS node is recognized as a sign in UBIFS log replay mechanism.
Log relaying during mount should find the CS node in log head
at beginning and then replay the following uncommitted buds.

Here is a bug in log replay path: If the log head, which is
indicated by @log_lnum in mst_node, is empty, current UBIFS
replay nothing and directly mount the partition without any
warning. This action will put filesystem in an abnormal state,
e.g. space management in LPT area is incorrect to the real
space usage in main area.

We reproduced this bug by fault injection: turn log head leb
into all 0xFF. UBIFS driver mount the polluted partition
normally. But errors occur while running fs_stress on this
mount:

[89068.055183] UBI error: ubi_io_read: error -74 (ECC error) while reading 59 bytes from PEB 711:33088, read 59 bytes
[89068.179877] UBIFS error (pid 10517): ubifs_check_node: bad magic 0x101031, expected 0x6101831
[89068.179882] UBIFS error (pid 10517): ubifs_check_node: bad node at LEB 591:28992
[89068.179891] Not a node, first 24 bytes:
[89068.179892] 00000000: 31 10 10 00 37 84 64 04 10 04 00 00 00 00 00 00 20 00 00 00 02 01 00 00                          1...7.d......... .......
[89068.180282] UBIFS error (pid 10517): ubifs_read_node: expected node type 2

This patch fix the problem by checking *lnum* to guarantee
the empty leb is not log head leb and return an error if the
log head leb is incorrectly empty. After this, we could catch
*log head empty* error in place.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-02-10 10:06:24 +02:00
Artem Bityutskiy fb4325a3d9 UBIFS: add a couple of extra asserts
... to catch possible memory corruptions.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-01-28 16:09:32 +01:00
Subodh Nijsure fee1756d80 UBIFS: add ubifs_err() to print error reason
This patch adds ubifs_err() output to some error paths to tell the user
what's going on.

Artem: improve the messages, rename too long variable

Signed-off-by: Subodh Nijsure <snijsure@grid-net.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Ben Shelton <ben.shelton@ni.com>
Acked-by: Brad Mouring <brad.mouring@ni.com>
Acked-by: Terry Wilcox <terry.wilcox@ni.com>
Acked-by: Gratian Crisan <gratian.crisan@ni.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-01-28 16:09:01 +01:00
Subodh Nijsure d7f0b70d30 UBIFS: Add security.* XATTR support for the UBIFS
Artem: rename static functions so that they do not use the "ubifs_" prefix - we
       only use this prefix for non-static functions.
Artem: remove few junk white-space changes in file.c

Signed-off-by: Subodh Nijsure <snijsure@grid-net.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Ben Shelton <ben.shelton@ni.com>
Acked-by: Brad Mouring <brad.mouring@ni.com>
Acked-by: Terry Wilcox <terry.wilcox@ni.com>
Acked-by: Gratian Crisan <gratian.crisan@ni.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-01-28 16:08:54 +01:00
Subodh Nijsure 895d9db253 UBIFS: Add xattr support for symlinks
Artem: rename the __ubifs_setxattr() functions to just 'setxattr()'.

Signed-off-by: Subodh Nijsure <snijsure@grid-net.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Ben Shelton <ben.shelton@ni.com>
Acked-by: Terry Wilcox <terry.wilcox@ni.com>
Acked-by: Gratian Crisan <gratian.crisan@ni.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-01-28 16:08:46 +01:00
Christoph Hellwig b83ae6d421 fs: remove mapping->backing_dev_info
Now that we never use the backing_dev_info pointer in struct address_space
we can simply remove it and save 4 to 8 bytes in every inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Reviewed-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-20 14:03:05 -07:00
Christoph Hellwig b4caecd480 fs: introduce f_op->mmap_capabilities for nommu mmap support
Since "BDI: Provide backing device capability information [try #3]" the
backing_dev_info structure also provides flags for the kind of mmap
operation available in a nommu environment, which is entirely unrelated
to it's original purpose.

Introduce a new nommu-only file operation to provide this information to
the nommu mmap code instead.  Splitting this from the backing_dev_info
structure allows to remove lots of backing_dev_info instance that aren't
otherwise needed, and entirely gets rid of the concept of providing a
backing_dev_info for a character device.  It also removes the need for
the mtd_inodefs filesystem.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Tejun Heo <tj@kernel.org>
Acked-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-20 14:02:58 -07:00
Subodh Nijsure a76284e6f8 UBIFS: fix a couple bugs in UBIFS xattr length calculation
The journal update function did not work for extended attributes properly,
because extended attribute inodes carry the xattr data, and the size of this
data was not taken into account.

Artem: improved commit message, amended the patch a bit.

Signed-off-by: Subodh Nijsure <snijsure@grid-net.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Ben Shelton <ben.shelton@ni.com>
Acked-by: Brad Mouring <brad.mouring@ni.com>
Acked-by: Gratian Crisan <gratian.crisan@ni.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-11-07 12:32:22 +02:00
Artem Bityutskiy 789c89935c UBIFS: fix budget leak in error path
We forgot to free the budget in 'write_begin_slow()' when 'do_readpage()'
fails. This patch fixes the issue.

Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-11-07 12:08:50 +02:00
Richard Weinberger 443b39cdd5 UBIFS: Fix trivial typo in power_cut_emulated()
s/withing/within/

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-09-30 09:29:44 +03:00
hujianyang 4b1a43eab1 UBIFS: Align the dump messages of SB_NODE
I found the dump messages of UBIFS_SB_NODE is not aligned. This
patch remove the extra space from the line which is retracted.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-09-26 13:33:41 +03:00
Richard Weinberger d577bc104f UBIFS: Remove bogus assert
This assertion was only correct before UBIFS had xattr support.
Now with xattr support also a directory node can carry data
and can act as host node.

Suggested-by: Artem Bityutskiy <dedekind1@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-09-19 18:11:50 +03:00
Artem Bityutskiy ba29e721eb UBIFS: fix free log space calculation
Hu (hujianyang <hujianyang@huawei.com>) discovered an issue in the
'empty_log_bytes()' function, which calculates how many bytes are left in the
log:

"
If 'c->lhead_lnum + 1 == c->ltail_lnum' and 'c->lhead_offs == c->leb_size', 'h'
would equalent to 't' and 'empty_log_bytes()' would return 'c->log_bytes'
instead of 0.
"

At this point it is not clear what would be the consequences of this, and
whether this may lead to any problems, but this patch addresses the issue just
in case.

Cc: stable@vger.kernel.org
Tested-by: hujianyang <hujianyang@huawei.com>
Reported-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-09-08 15:55:28 +03:00
Artem Bityutskiy 052c28073f UBIFS: fix a race condition
Hu (hujianyang@huawei.com) discovered a race condition which may lead to a
situation when UBIFS is unable to mount the file-system after an unclean
reboot. The problem is theoretical, though.

In UBIFS, we have the log, which basically a set of LEBs in a certain area. The
log has the tail and the head.

Every time user writes data to the file-system, the UBIFS journal grows, and
the log grows as well, because we append new reference nodes to the head of the
log. So the head moves forward all the time, while the log tail stays at the
same position.

At any time, the UBIFS master node points to the tail of the log. When we mount
the file-system, we scan the log, and we always start from its tail, because
this is where the master node points to. The only occasion when the tail of the
log changes is the commit operation.

The commit operation has 2 phases - "commit start" and "commit end". The former
is relatively short, and does not involve much I/O. During this phase we mostly
just build various in-memory lists of the things which have to be written to
the flash media during "commit end" phase.

During the commit start phase, what we do is we "clean" the log. Indeed, the
commit operation will index all the data in the journal, so the entire journal
"disappears", and therefore the data in the log become unneeded. So we just
move the head of the log to the next LEB, and write the CS node there. This LEB
will be the tail of the new log when the commit operation finishes.

When the "commit start" phase finishes, users may write more data to the
file-system, in parallel with the ongoing "commit end" operation. At this point
the log tail was not changed yet, it is the same as it had been before we
started the commit. The log head keeps moving forward, though.

The commit operation now needs to write the new master node, and the new master
node should point to the new log tail. After this the LEBs between the old log
tail and the new log tail can be unmapped and re-used again.

And here is the possible problem. We do 2 operations: (a) We first update the
log tail position in memory (see 'ubifs_log_end_commit()'). (b) And then we
write the master node (see the big lock of code in 'do_commit()').

But nothing prevents the log head from moving forward between (a) and (b), and
the log head may "wrap" now to the old log tail. And when the "wrap" happens,
the contends of the log tail gets erased. Now a power cut happens and we are in
trouble. We end up with the old master node pointing to the old tail, which was
erased. And replay fails because it expects the master node to point to the
correct log tail at all times.

This patch merges the abovementioned (a) and (b) operations by moving the master
node change code to the 'ubifs_log_end_commit()' function, so that it runs with
the log mutex locked, which will prevent the log from being changed benween
operations (a) and (b).

Cc: stable@vger.kernel.org # 07e19df UBIFS: remove mst_mutex
Cc: stable@vger.kernel.org
Reported-by: hujianyang <hujianyang@huawei.com>
Tested-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-09-08 15:55:02 +03:00
hujianyang 25601a3c97 UBIFS: Add log overlap assertions
We use a circle area to record the log nodes in ubifs. This log area
should not be overlapped. But after researching the code, I found
some conditions may lead log head wraps log ltail. Although we've
fixed the problems discovered, there may be some other issues still
left.

This patch adds assertions where lhead changes to next leb to make
sure ltail is not wrapped.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-31 15:52:51 +03:00
Artem Bityutskiy 6390e99177 Revert "UBIFS: add a log overlap assertion"
This reverts commit 545f7fdf6d.

Hujianyang's testing revealed that the patch is bogus.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-28 19:15:19 +03:00
Artem Bityutskiy 545f7fdf6d UBIFS: add a log overlap assertion
Add an assertion which checkes that the head of the log never overlaps with the
tail of the log.

Suggested-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:55:19 +03:00
Artem Bityutskiy f1cb705acc UBIFS: remove unnecessary check
Remove the "if (c->lhead_offs == 0)" check because is unnecessary, since
at that point the log head offset is guaranteed to be zero due to the previous
operation.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:54:25 +03:00
Artem Bityutskiy 07e19dff63 UBIFS: remove mst_mutex
The 'mst_mutex' is not needed since because 'ubifs_write_master()' is only
called on the mount path and commit path. The mount path is sequential and
there is no parallelism, and the commit path is also serialized - there is only
one commit going on at a time.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:53:52 +03:00
Fabian Frederick 39274a1e8d UBIFS: kernel-doc warning fix
s/data/timer

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:53:52 +03:00
Fabian Frederick d4eb08ff0a UBIFS: replace seq_printf by seq_puts
Fix checkpatch warnings:
"WARNING: Prefer seq_puts to seq_printf"

Andrew Morton wrote:

"
- puts is presumably faster

- puts doesn't go rogue if you accidentally pass it a "%".

- this patch actually made fs/ubifs/super.o 12 bytes smaller.
  Perhaps because seq_printf() is a varargs function, forcing the
  caller to pass args on the stack instead of in registers.
"

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:53:52 +03:00
Fabian Frederick 86b4c14de3 UBIFS: replace count*size kzalloc by kcalloc
kcalloc manages count*sizeof overflow.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:53:52 +03:00
Fabian Frederick ef13f01828 UBIFS: kernel-doc warning fix
No grouped argument in drop_last_node.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:53:52 +03:00
hujianyang 6dcfb80264 UBIFS: fix error path in create_default_filesystem()
In the end of 'create_default_filesystem()' we need to check
the return value of 'ubifs_write_node()' to ensure that we have
successfully written the 'cs_node'.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:53:52 +03:00
Artem Bityutskiy f2b6521aa1 UBIFS: fix spelling of "scanned"
Randy Dunlap pointed that we should use "scanned" instead of "scaned". This
patch makes the correction.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:53:51 +03:00
Seunghun Lee d685c41215 UBIFS: fix some comments
This patch fixes some comments about return type.

Signed-off-by: Seunghun Lee <waydi1@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:53:51 +03:00
hujianyang c7b5bb0beb UBIFS: remove useless @ecc in struct ubifs_scan_leb
We set @ecc in ubifs_scan_leb only if leb_read returns EBADMSG and
do not use it any more. This patch removes this variable and adds
comments about EBADMSG handling.

Artem: re-phrase commentaries

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:53:51 +03:00
hujianyang b793a8c888 UBIFS: remove useless statements
This patch removes useless and duplicate statements.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:53:51 +03:00
hujianyang ce6ebdb87e UBIFS: Add missing break statements in dbg_chk_pnode()
This is a minor fix. These two branches in 'dbg_chk_pnode()'
are dealing with different conditions. Although there is
no fault in current state, I think adding "break"s in
each end of branch is better.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:53:51 +03:00
hujianyang 5a95741a57 UBIFS: fix error handling in dump_lpt_leb()
This patch checks the return value of 'ubifs_unpack_nnode()'.
If this function returns an error, 'nnode' may not be
initialized, so just print an error message and break.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:53:51 +03:00
Linus Torvalds 16b9057804 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "This the bunch that sat in -next + lock_parent() fix.  This is the
  minimal set; there's more pending stuff.

  In particular, I really hope to get acct.c fixes merged this cycle -
  we need that to deal sanely with delayed-mntput stuff.  In the next
  pile, hopefully - that series is fairly short and localized
  (kernel/acct.c, fs/super.c and fs/namespace.c).  In this pile: more
  iov_iter work.  Most of prereqs for ->splice_write with sane locking
  order are there and Kent's dio rewrite would also fit nicely on top of
  this pile"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (70 commits)
  lock_parent: don't step on stale ->d_parent of all-but-freed one
  kill generic_file_splice_write()
  ceph: switch to iter_file_splice_write()
  shmem: switch to iter_file_splice_write()
  nfs: switch to iter_splice_write_file()
  fs/splice.c: remove unneeded exports
  ocfs2: switch to iter_file_splice_write()
  ->splice_write() via ->write_iter()
  bio_vec-backed iov_iter
  optimize copy_page_{to,from}_iter()
  bury generic_file_aio_{read,write}
  lustre: get rid of messing with iovecs
  ceph: switch to ->write_iter()
  ceph_sync_direct_write: stop poking into iov_iter guts
  ceph_sync_read: stop poking into iov_iter guts
  new helper: copy_page_from_iter()
  fuse: switch to ->write_iter()
  btrfs: switch to ->write_iter()
  ocfs2: switch to ->write_iter()
  xfs: switch to ->write_iter()
  ...
2014-06-12 10:30:18 -07:00
Al Viro 8d0207652c ->splice_write() via ->write_iter()
iter_file_splice_write() - a ->splice_write() instance that gathers the
pipe buffers, builds a bio_vec-based iov_iter covering those and feeds
it to ->write_iter().  A bunch of simple cases coverted to that...

[AV: fixed the braino spotted by Cyrill]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-06-12 00:18:51 -04:00
Linus Torvalds d53b47c08d This pull request contains several UBIFS fixes. One of them fixes a race
condition between the mmap page fault path and fsync. Another just removes a
 bogus assertion from the UBIFS memory shrinker.
 
 UBIFS also started honoring the MS_SILENT mount flag, so now it won't print
 many I/O errors when user-space just tries to probe for the FS.
 
 Rest of the changes are rather minor UBI/UBIFS fixes, improvements, and
 clean-ups.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTlqqmAAoJECmIfjd9wqK0cYAP+QGoB8+DOGv4+rTtUegNaWaG
 QAeAkvOBxDa72NrTYMtDFBKiPPYcz0BUnPgsCyvYIlYknRuZ4xMsu1BucLF3y8E3
 P8aHpnNas0dA3el/M+M0qwpEG0pb1lQvFT1dJVP6/D4q4VexLlgt7PlQjP+98Dua
 jp9jBDMu3sEDsqA9NiO2hfuFuIqF6DFnBpglkNcGcAyOtzQ/Ps49nivhb1mFKuzI
 2kSm2kWmZCTnLlGG1uj9+diCpTKWZoES2Jmo6gcZjQIjlejSzQw4Fo8LqdmhfwOg
 0ba1D9kJuwPHmBeM0UW4fLtrJHkvs2F66YaRcbA7GUo5lNw6+yxTqi3jkGevsPOD
 7eQB8FVCco14PFKu8Vx4lbkS2ekZN6aQuYYw/EeY2f+w8MpQTfcWINP5OVNHHGVA
 QDeRiu9mRMbm3JARRe9NeL0+sryGN402jCHtESAhONUDsCmZKX9k8HUMY1iaPAIp
 kGJWOjbyzYRMJiOfQiklv4N6rusmnECiRWGCliKDCU6TER8U6m9ZCUHUKmtX+56c
 2yRtd6y6LVequ/p3wC3x66jF64wjh2PlTM5jrWSOIg42ihIGSMAbmB6MKA+4RxIv
 EejZ4lhCxUg6Xp7zd/qgdu3aJ/P2OM8yFL+BngGKFac54EnTDEUcc7d8c2DZZv8s
 7YYjpiKK1NQGnih0FABA
 =CP7Z
 -----END PGP SIGNATURE-----

Merge tag 'upstream-3.16-rc1-v2' of git://git.infradead.org/linux-ubifs

Pull UBIFS updates from Artem Bityutskiy:
 "This contains several UBIFS fixes.  One of them fixes a race condition
  between the mmap page fault path and fsync.  Another just removes a
  bogus assertion from the UBIFS memory shrinker.

  UBIFS also started honoring the MS_SILENT mount flag, so now it won't
  print many I/O errors when user-space just tries to probe for the FS.

  Rest of the changes are rather minor UBI/UBIFS fixes, improvements,
  and clean-ups"

* tag 'upstream-3.16-rc1-v2' of git://git.infradead.org/linux-ubifs:
  UBIFS: Add an assertion for clean_zn_cnt
  UBIFS: respect MS_SILENT mount flag
  UBIFS: Remove incorrect assertion in shrink_tnc()
  UBIFS: fix debugging check
  UBIFS: add missing ui pointer in debugging code
  UBI: block: Fix error path on alloc_workqueue failure
  UBIFS: Fix dump messages in ubifs_dump_lprops
  UBI: fix rb_tree node comparison in add_map
  UBIFS: Remove unused variables in ubifs_budget_space
  UBI: weaken the 'exclusive' constraint when opening volumes to rename
  UBIFS: fix an mmap and fsync race condition
2014-06-10 08:55:46 -07:00
Linus Torvalds 776edb5931 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull core locking updates from Ingo Molnar:
 "The main changes in this cycle were:

   - reduced/streamlined smp_mb__*() interface that allows more usecases
     and makes the existing ones less buggy, especially in rarer
     architectures

   - add rwsem implementation comments

   - bump up lockdep limits"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  rwsem: Add comments to explain the meaning of the rwsem's count field
  lockdep: Increase static allocations
  arch: Mass conversion of smp_mb__*()
  arch,doc: Convert smp_mb__*()
  arch,xtensa: Convert smp_mb__*()
  arch,x86: Convert smp_mb__*()
  arch,tile: Convert smp_mb__*()
  arch,sparc: Convert smp_mb__*()
  arch,sh: Convert smp_mb__*()
  arch,score: Convert smp_mb__*()
  arch,s390: Convert smp_mb__*()
  arch,powerpc: Convert smp_mb__*()
  arch,parisc: Convert smp_mb__*()
  arch,openrisc: Convert smp_mb__*()
  arch,mn10300: Convert smp_mb__*()
  arch,mips: Convert smp_mb__*()
  arch,metag: Convert smp_mb__*()
  arch,m68k: Convert smp_mb__*()
  arch,m32r: Convert smp_mb__*()
  arch,ia64: Convert smp_mb__*()
  ...
2014-06-03 12:57:53 -07:00
hujianyang 380347e9ca UBIFS: Add an assertion for clean_zn_cnt
This patch adds a new ubifs_assert() in ubifs_tnc_close() to check
if there are any leaks of per-filesystem @clean_zn_cnt. This new
assert inspects whether the return value of ubifs_destroy_tnc_subtree()
is equal to @clean_zn_cnt or not while umount.

Artem: a minor amendment

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-06-03 11:16:51 +03:00
Daniel Golle 90bea5a3f0 UBIFS: respect MS_SILENT mount flag
When attempting to mount a non-ubifs formatted volume, lots of error
messages (including a stack dump) are thrown to the kernel log even if
the MS_SILENT mount flag is set.
Fix this by introducing adding an additional state-variable in
struct ubifs_info and suppress error messages in ubifs_read_node if
MS_SILENT is set.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-06-02 18:00:52 +03:00
hujianyang 72abc8f4b4 UBIFS: Remove incorrect assertion in shrink_tnc()
I hit the same assert failed as Dolev Raviv reported in Kernel v3.10
shows like this:

[ 9641.164028] UBIFS assert failed in shrink_tnc at 131 (pid 13297)
[ 9641.234078] CPU: 1 PID: 13297 Comm: mmap.test Tainted: G           O 3.10.40 #1
[ 9641.234116] [<c0011a6c>] (unwind_backtrace+0x0/0x12c) from [<c000d0b0>] (show_stack+0x20/0x24)
[ 9641.234137] [<c000d0b0>] (show_stack+0x20/0x24) from [<c0311134>] (dump_stack+0x20/0x28)
[ 9641.234188] [<c0311134>] (dump_stack+0x20/0x28) from [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs])
[ 9641.234265] [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs]) from [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs])
[ 9641.234307] [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs]) from [<c00cdad8>] (shrink_slab+0x1d4/0x2f8)
[ 9641.234327] [<c00cdad8>] (shrink_slab+0x1d4/0x2f8) from [<c00d03d0>] (do_try_to_free_pages+0x300/0x544)
[ 9641.234344] [<c00d03d0>] (do_try_to_free_pages+0x300/0x544) from [<c00d0a44>] (try_to_free_pages+0x2d0/0x398)
[ 9641.234363] [<c00d0a44>] (try_to_free_pages+0x2d0/0x398) from [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8)
[ 9641.234382] [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8) from [<c00f62d8>] (new_slab+0x78/0x238)
[ 9641.234400] [<c00f62d8>] (new_slab+0x78/0x238) from [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c)
[ 9641.234419] [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c) from [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188)
[ 9641.234459] [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188) from [<bf227908>] (do_readpage+0x168/0x468 [ubifs])
[ 9641.234553] [<bf227908>] (do_readpage+0x168/0x468 [ubifs]) from [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs])
[ 9641.234606] [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs]) from [<c00c17c0>] (filemap_fault+0x304/0x418)
[ 9641.234638] [<c00c17c0>] (filemap_fault+0x304/0x418) from [<c00de694>] (__do_fault+0xd4/0x530)
[ 9641.234665] [<c00de694>] (__do_fault+0xd4/0x530) from [<c00e10c0>] (handle_pte_fault+0x480/0xf54)
[ 9641.234690] [<c00e10c0>] (handle_pte_fault+0x480/0xf54) from [<c00e2bf8>] (handle_mm_fault+0x140/0x184)
[ 9641.234716] [<c00e2bf8>] (handle_mm_fault+0x140/0x184) from [<c0316688>] (do_page_fault+0x150/0x3ac)
[ 9641.234737] [<c0316688>] (do_page_fault+0x150/0x3ac) from [<c000842c>] (do_DataAbort+0x3c/0xa0)
[ 9641.234759] [<c000842c>] (do_DataAbort+0x3c/0xa0) from [<c0314e38>] (__dabt_usr+0x38/0x40)

After analyzing the code, I found a condition that may cause this failed
in correct operations. Thus, I think this assertion is wrong and should be
removed.

Suppose there are two clean znodes and one dirty znode in TNC. So the
per-filesystem atomic_t @clean_zn_cnt is (2). If commit start, dirty_znode
is set to COW_ZNODE in get_znodes_to_commit() in case of potentially ops
on this znode. We clear COW bit and DIRTY bit in write_index() without
@tnc_mutex locked. We don't increase @clean_zn_cnt in this place. As the
comments in write_index() shows, if another process hold @tnc_mutex and
dirty this znode after we clean it, @clean_zn_cnt would be decreased to (1).
We will increase @clean_zn_cnt to (2) with @tnc_mutex locked in
free_obsolete_znodes() to keep it right.

If shrink_tnc() performs between decrease and increase, it will release
other 2 clean znodes it holds and found @clean_zn_cnt is less than zero
(1 - 2 = -1), then hit the assertion. Because free_obsolete_znodes() will
soon correct @clean_zn_cnt and no harm to fs in this case, I think this
assertion could be removed.

2 clean zondes and 1 dirty znode, @clean_zn_cnt == 2

Thread A (commit)         Thread B (write or others)       Thread C (shrinker)
->write_index
   ->clear_bit(DIRTY_NODE)
   ->clear_bit(COW_ZNODE)

            @clean_zn_cnt == 2
                          ->mutex_locked(&tnc_mutex)
                          ->dirty_cow_znode
                              ->!ubifs_zn_cow(znode)
                              ->!test_and_set_bit(DIRTY_NODE)
                              ->atomic_dec(&clean_zn_cnt)
                          ->mutex_unlocked(&tnc_mutex)

            @clean_zn_cnt == 1
                                                           ->mutex_locked(&tnc_mutex)
                                                           ->shrink_tnc
                                                             ->destroy_tnc_subtree
                                                             ->atomic_sub(&clean_zn_cnt, 2)
                                                             ->ubifs_assert  <- hit
                                                           ->mutex_unlocked(&tnc_mutex)

            @clean_zn_cnt == -1
->mutex_lock(&tnc_mutex)
->free_obsolete_znodes
   ->atomic_inc(&clean_zn_cnt)
->mutux_unlock(&tnc_mutex)

            @clean_zn_cnt == 0 (correct after shrink)

Signed-off-by: hujianyang <hujianyang@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-06-02 11:28:24 +03:00
Artem Bityutskiy ba6a7d5563 UBIFS: fix debugging check
The debugging check which verifies that we never write outside of the file
length was incorrect, since it was multiplying file length by the page size,
instead of dividing. Fix this.

Spotted-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-05-28 11:10:09 +03:00
Daniel Golle a0fd59511e UBIFS: add missing ui pointer in debugging code
If UBIFS_DEBUG is defined an additional assertion of the ui_lock
spinlock in do_writepage cannot compile because the ui pointer has not
been previously declared.

Fix this by declaring and initializing the ui pointer in case
UBIFS_DEBUG is defined.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-05-27 15:23:40 +03:00
hujianyang dac3698147 UBIFS: Fix dump messages in ubifs_dump_lprops
Function ubifs_read_one_lp will not set @lp and returns
an error when ubifs_read_one_lp failed. We should not
perform ubifs_dump_lprop in this case because @lp is not
initialized as we wanted.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-05-27 15:05:23 +03:00
hujianyang 0da846f42f UBIFS: Remove unused variables in ubifs_budget_space
I found two variables in ubifs_budget_space declared but not
use. This state remains since the first commit 1e5176. So just
remove them.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-05-13 13:45:16 +03:00
hujianyang 691a7c6f28 UBIFS: fix an mmap and fsync race condition
There is a race condition in UBIFS:

Thread A (mmap)                        Thread B (fsync)

->__do_fault                           ->write_cache_pages
   -> ubifs_vm_page_mkwrite
       -> budget_space
       -> lock_page
       -> release/convert_page_budget
       -> SetPagePrivate
       -> TestSetPageDirty
       -> unlock_page
                                       -> lock_page
                                           -> TestClearPageDirty
                                           -> ubifs_writepage
                                               -> do_writepage
                                                   -> release_budget
                                                   -> ClearPagePrivate
                                                   -> unlock_page
   -> !(ret & VM_FAULT_LOCKED)
   -> lock_page
   -> set_page_dirty
       -> ubifs_set_page_dirty
           -> TestSetPageDirty (set page dirty without budgeting)
   -> unlock_page

This leads to situation where we have a diry page but no budget allocated for
this page, so further write-back may fail with -ENOSPC.

In this fix we return from page_mkwrite without performing unlock_page. We
return VM_FAULT_LOCKED instead. After doing this, the race above will not
happen.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Tested-by: Laurence Withers <lwithers@guralp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-05-13 13:45:15 +03:00
Al Viro f5674c31ee ubifs: switch to ->write_iter()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-06 17:39:38 -04:00
Al Viro aad4f8bb42 switch simple generic_file_aio_read() users to ->read_iter()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-06 17:37:55 -04:00
Artem Bityutskiy fcdd57c890 UBIFS: fix remount error path
Dan's "smatch" checker found out that there was a bug in the error path of the
'ubifs_remount_rw()' function. Instead of jumping to the "out" label which
cleans-things up, we just returned.

This patch fixes the problem.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-05-05 09:31:33 +03:00
Peter Zijlstra 4e857c58ef arch: Mass conversion of smp_mb__*()
Mostly scripted conversion of the smp_mb__* barriers.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-18 14:20:48 +02:00
Kirill A. Shutemov f1820361f8 mm: implement ->map_pages for page cache
filemap_map_pages() is generic implementation of ->map_pages() for
filesystems who uses page cache.

It should be safe to use filemap_map_pages() for ->map_pages() if
filesystem use filemap_fault() for ->fault().

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ning Qu <quning@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:35:53 -07:00
Linus Torvalds 24e7ea3bea Major changes for 3.14 include support for the newly added ZERO_RANGE
and COLLAPSE_RANGE fallocate operations, and scalability improvements
 in the jbd2 layer and in xattr handling when the extended attributes
 spill over into an external block.
 
 Other than that, the usual clean ups and minor bug fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJTPbD2AAoJENNvdpvBGATwDmUQANSfGYIQazB8XKKgtNTMiG/Y
 Ky7n1JzN9lTX/6nMsqQnbfCweLRmxqpWUBuyKDRHUi8IG0/voXSTFsAOOgz0R15A
 ERRRWkVvHixLpohuL/iBdEMFHwNZYPGr3jkm0EIgzhtXNgk5DNmiuMwvHmCY27kI
 kdNZIw9fip/WRNoFLDBGnLGC37aanoHhCIbVlySy5o9LN1pkC8BgXAYV0Rk19SVd
 bWCudSJEirFEqWS5H8vsBAEm/ioxTjwnNL8tX8qms6orZ6h8yMLFkHoIGWPw3Q15
 a0TSUoMyav50Yr59QaDeWx9uaPQVeK41wiYFI2rZOnyG2ts0u0YXs/nLwJqTovgs
 rzvbdl6cd3Nj++rPi97MTA7iXK96WQPjsDJoeeEgnB0d/qPyTk6mLKgftzLTNgSa
 ZmWjrB19kr6CMbebMC4L6eqJ8Fr66pCT8c/iue8wc4MUHi7FwHKH64fqWvzp2YT/
 +165dqqo2JnUv7tIp6sUi1geun+bmDHLZFXgFa7fNYFtcU3I+uY1mRr3eMVAJndA
 2d6ASe/KhQbpVnjKJdQ8/b833ZS3p+zkgVPrd68bBr3t7gUmX91wk+p1ct6rUPLr
 700F+q/pQWL8ap0pU9Ht/h3gEJIfmRzTwxlOeYyOwDseqKuS87PSB3BzV3dDunSU
 DrPKlXwIgva7zq5/S0Vr
 =4s1Z
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "Major changes for 3.14 include support for the newly added ZERO_RANGE
  and COLLAPSE_RANGE fallocate operations, and scalability improvements
  in the jbd2 layer and in xattr handling when the extended attributes
  spill over into an external block.

  Other than that, the usual clean ups and minor bug fixes"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (42 commits)
  ext4: fix premature freeing of partial clusters split across leaf blocks
  ext4: remove unneeded test of ret variable
  ext4: fix comment typo
  ext4: make ext4_block_zero_page_range static
  ext4: atomically set inode->i_flags in ext4_set_inode_flags()
  ext4: optimize Hurd tests when reading/writing inodes
  ext4: kill i_version support for Hurd-castrated file systems
  ext4: each filesystem creates and uses its own mb_cache
  fs/mbcache.c: doucple the locking of local from global data
  fs/mbcache.c: change block and index hash chain to hlist_bl_node
  ext4: Introduce FALLOC_FL_ZERO_RANGE flag for fallocate
  ext4: refactor ext4_fallocate code
  ext4: Update inode i_size after the preallocation
  ext4: fix partial cluster handling for bigalloc file systems
  ext4: delete path dealloc code in ext4_ext_handle_uninitialized_extents
  ext4: only call sync_filesystm() when remounting read-only
  fs: push sync_filesystem() down to the file system's remount_fs()
  jbd2: improve error messages for inconsistent journal heads
  jbd2: minimize region locked by j_list_lock in jbd2_journal_forget()
  jbd2: minimize region locked by j_list_lock in journal_get_create_access()
  ...
2014-04-04 15:39:39 -07:00
Johannes Weiner 91b0abe36a mm + fs: store shadow entries in page cache
Reclaim will be leaving shadow entries in the page cache radix tree upon
evicting the real page.  As those pages are found from the LRU, an
iput() can lead to the inode being freed concurrently.  At this point,
reclaim must no longer install shadow pages because the inode freeing
code needs to ensure the page tree is really empty.

Add an address_space flag, AS_EXITING, that the inode freeing code sets
under the tree lock before doing the final truncate.  Reclaim will check
for this flag before installing shadow pages.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Metin Doslu <metin@citusdata.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Ozgun Erdogan <ozgun@citusdata.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Cc: Ryan Mallon <rmallon@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03 16:21:01 -07:00
Theodore Ts'o 02b9984d64 fs: push sync_filesystem() down to the file system's remount_fs()
Previously, the no-op "mount -o mount /dev/xxx" operation when the
file system is already mounted read-write causes an implied,
unconditional syncfs().  This seems pretty stupid, and it's certainly
documented or guaraunteed to do this, nor is it particularly useful,
except in the case where the file system was mounted rw and is getting
remounted read-only.

However, it's possible that there might be some file systems that are
actually depending on this behavior.  In most file systems, it's
probably fine to only call sync_filesystem() when transitioning from
read-write to read-only, and there are some file systems where this is
not needed at all (for example, for a pseudo-filesystem or something
like romfs).

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-fsdevel@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Jan Kara <jack@suse.cz>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Anders Larsen <al@alarsen.net>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: Petr Vandrovec <petr@vandrovec.name>
Cc: xfs@oss.sgi.com
Cc: linux-btrfs@vger.kernel.org
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: codalist@coda.cs.cmu.edu
Cc: linux-ext4@vger.kernel.org
Cc: linux-f2fs-devel@lists.sourceforge.net
Cc: fuse-devel@lists.sourceforge.net
Cc: cluster-devel@redhat.com
Cc: linux-mtd@lists.infradead.org
Cc: jfs-discussion@lists.sourceforge.net
Cc: linux-nfs@vger.kernel.org
Cc: linux-nilfs@vger.kernel.org
Cc: linux-ntfs-dev@lists.sourceforge.net
Cc: ocfs2-devel@oss.oracle.com
Cc: reiserfs-devel@vger.kernel.org
2014-03-13 10:14:33 -04:00
Cody P Schafer bb25e49ff8 fs/ubifs: use rbtree postorder iteration helper instead of opencoding
Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead
of opencoding an alternate postorder iteration that modifies the tree

Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23 16:37:03 -08:00
Linus Torvalds 9bc9ccd7db Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "All kinds of stuff this time around; some more notable parts:

   - RCU'd vfsmounts handling
   - new primitives for coredump handling
   - files_lock is gone
   - Bruce's delegations handling series
   - exportfs fixes

  plus misc stuff all over the place"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (101 commits)
  ecryptfs: ->f_op is never NULL
  locks: break delegations on any attribute modification
  locks: break delegations on link
  locks: break delegations on rename
  locks: helper functions for delegation breaking
  locks: break delegations on unlink
  namei: minor vfs_unlink cleanup
  locks: implement delegations
  locks: introduce new FL_DELEG lock flag
  vfs: take i_mutex on renamed file
  vfs: rename I_MUTEX_QUOTA now that it's not used for quotas
  vfs: don't use PARENT/CHILD lock classes for non-directories
  vfs: pull ext4's double-i_mutex-locking into common code
  exportfs: fix quadratic behavior in filehandle lookup
  exportfs: better variable name
  exportfs: move most of reconnect_path to helper function
  exportfs: eliminate unused "noprogress" counter
  exportfs: stop retrying once we race with rename/remove
  exportfs: clear DISCONNECTED on all parents sooner
  exportfs: more detailed comment for path_reconnect
  ...
2013-11-13 15:34:18 +09:00
Linus Torvalds fbe43ff003 Mostly fixes for the power cut emulation UBIFS mode, and only one functional
change which fixes a return error code.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJSgh3wAAoJECmIfjd9wqK0dxwQALF6rMiFldVa9uZnNgOawzA9
 yB8OR2EmQilZzCcSCLKaPIwzc7izJO7H6hbaLZrNY9UHzjoCwLhTGkbdkuGwkKuF
 Q+M99MKGnYghNWsOWu50J7wewR9zJPD1UGy3ZU/CJXfQzK4uwHIEOiXss+b0kZCo
 sW5ogFOiFzEkz8/CtGwffPXNWuqwPXbDBVHXBjnmv9MCCzsSRdGEpNClCOI/qUHQ
 pFZhKVHSiqYDPvvwQ/pTGv6kw2tu2pVh6+O232g35JB9G3NHq7MDxg0RLr61C7qK
 3ZK6K34r6D09NKhjrOOAsf/EgzyNXtcIL9ySYamaZzEsMGBb2LNMneu3gkrk7O49
 5CXi5Em4cg8eBzhEEmHDxCZanwPhggnuLh+YPlaBBNU/z3NiYRNYoHRjzQWCFwa/
 yee6AzU+w1i0NRtjpPIAmpaqkryJnAZ+c2Hthlm+DPtN3DzWX13QxHdWTxC3Xycc
 8aZNqC128meDqrKRxKkhh1H5TLRdsVTkC9yf0Iv8U8nQg4aT/dPJedNOI4n20Oqy
 C8LnABH6wtc7MImzA5xRbk7CHI0Feb8h6darSKWSLydKNJmvyJ6nRdsi5tnfBEyn
 kbLdYed833tCll86G8iYMwt3jO0DURn+yNlQrTsf3kvNTADvJdzAoP3tudgDxdIA
 T7Uxs/2VFIFAlDK+U5Aw
 =ap2l
 -----END PGP SIGNATURE-----

Merge tag 'upstream-3.13-rc1' of git://git.infradead.org/linux-ubifs

Pull ubifs changes from Artem Bityutskiy:
 "Mostly fixes for the power cut emulation UBIFS mode, and only one
  functional change which fixes a return error code"

* tag 'upstream-3.13-rc1' of git://git.infradead.org/linux-ubifs:
  UBIFS: correct data corruption range
  UBIFS: fix return code
  UBIFS: remove unnecessary code in ubifs_garbage_collect
2013-11-13 15:28:45 +09:00
Mats Kärrman 58a4e23703 UBIFS: correct data corruption range
With power-cut emulation, it is possible that sometimes no data at all is
corrupted and that confusing messages are printed due to errors in the
computation of data corruption range.

[1] The start of the range should be [0..len-1], not [0..len].
[2] The end of the range should always be at least 1 greater than the start.

Signed-off-by: Mats Karrman <mats.karrman@tritech.se>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2013-10-26 11:33:38 +01:00
Wei Yongjun 7203db97b7 UBIFS: fix return code
Fix to return -ENOMEM in the kmalloc() and d_make_root() error handling
case instead of 0, as done elsewhere in those functions.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2013-10-26 11:11:59 +01:00
Al Viro 4cb2a01d8c ubifs: switch to %pd
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-10-24 23:34:51 -04:00
wang.bo116@zte.com.cn e71d1a59e7 UBIFS: remove unnecessary code in ubifs_garbage_collect
In ubifs_garbage_collect,local variable "space_before" calculate twice. In
fact, at the beginning of the loop, there is no need to calculate this
variable. Calculate it before call "ubifs_garbage_collect_leb" is enough. This
patch just remove the unnecessary calculate code.

Signed-off-by: wang bo <wang.bo116@zte.com.cn>
Acked-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2013-10-22 13:34:27 +01:00
Linus Torvalds 098e7f1665 Just one patch which fixes the power-cut recovery testing mode.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJSNrA6AAoJECmIfjd9wqK0r60P/ijFSSZxYEr5/ChOVt1Jjs/q
 cx0FcOO3r4RnXJXEQ9yNNlHWDZ+ZWYrSalaaKAAeh0WGvmCkHEyUbrAuL3Y76GEw
 O37eM9Qlbpb23iQ+gTtapIhdBjABGwo556UebzUsSkJZef+B7aCdgxNjOYAYitF6
 mcG3dndj91XUuhNd+93R8ovVHFjXwndruCYp+UsAajSHYGs3ThocWXXVRF/Rv0mG
 GDeJD4MGuNOGG5t6WjeOYlVE5WuDHJBUYRoUqhnzHfEx7hQ60m26H6Oir8ncXj7/
 3IIrfkF9pbIFiQ1jBRmcGFzzaY2UTqXaDoZN5MUc1w/1DH9PGkfeF7OfpREvDIJY
 rvbT/lX/iHUbQ7lQ+CBZqc3orJT0t1nJy/mhtRy3rb2xFf2gRaFwMwuLPFgeBarm
 hbUpZu3VQpi0Anx7pTavbYn5ZCoobBHvnzuOGg/2EjOFhW0baTnXzmXgHGoJAW+v
 ZxcLEMsTFERr3T6pqxu6v9CNL3DVkO2jvKNR/0I30cE4XDjcd81tXvOAfw0pVp3x
 bEhWLJSG2UFybQ2/PLgvuTriZ4wuJ2Mw5KCGmfp3i0IM9J7/1e9tMNvUOickcnz2
 qkSFuL8Ee47QmTV95tdRwM2T679MXmDoPY6QulIl2bSMnshfMEKbL83wNCpVzXee
 wwV0z4EbGlNtbR254LVF
 =0WcB
 -----END PGP SIGNATURE-----

Merge tag 'upstream-3.12-rc1' of git://git.infradead.org/linux-ubifs

Pull ubifs fix from Artem Bityutskiy:
 "Just one patch which fixes the power-cut recovery testing mode.

  I'll start using a single UBI/UBIFS tree instead of 2 trees from now
  on.  So in the future you'll get 1 small pull request instead of 2
  tiny ones"

* tag 'upstream-3.12-rc1' of git://git.infradead.org/linux-ubifs:
  UBIFS: remove invalid warn msg with tst_recovery enabled
2013-09-16 15:36:55 -04:00
Dave Chinner 1ab6c4997e fs: convert fs shrinkers to new scan/count API
Convert the filesystem shrinkers to use the new API, and standardise some
of the behaviours of the shrinkers at the same time.  For example,
nr_to_scan means the number of objects to scan, not the number of objects
to free.

I refactored the CIFS idmap shrinker a little - it really needs to be
broken up into a shrinker per tree and keep an item count with the tree
root so that we don't need to walk the tree every time the shrinker needs
to count the number of objects in the tree (i.e.  all the time under
memory pressure).

[glommer@openvz.org: fixes for ext4, ubifs, nfs, cifs and glock. Fixes are needed mainly due to new code merged in the tree]
[assorted fixes folded in]
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Glauber Costa <glommer@openvz.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-10 18:56:31 -04:00
Mats Kärrman c23e9b75cc UBIFS: remove invalid warn msg with tst_recovery enabled
Signed-off-by: Mats Karrman <mats.karrman@tritech.se>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2013-08-16 16:42:08 +03:00
Linus Torvalds 2dd1cb5a7e Only a single patch which fixes a message.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJR1to/AAoJECmIfjd9wqK0t80P/3p1iovDYk+bERO0W2AYSWRO
 vDXxdT95xmP9qT81+nd3/Y5WYAl86LLftct+CrecJv4NtwkK4Fb+By3V8TpBS4VH
 ROoTXRTiXe8krrszi+6w9lNCuomqmlJoBC3sbTwRuFyGa4XPKYCoawbEu1OySPmW
 cK/z633usm+k67KZ184B/KW+3znFs0G6fZ18mKRT9ImfCpP6E/xq5a7+xYrLdEcg
 MdI3hRZIpEfDVl4RX12UhxwHXcMzc/tLRYBoWf3ukJaKSRuVfoQMh+slc7L9RlAY
 YaIr2Q4qiUSsv4j7+Hq18sSQwvmi7rLxHyG6ZbN9PU8uzPAlXqI1sqhNKwMVXNyh
 AJWtHkkt/nZY+vZmCsOZY7whGtVKoumCqSWWMai/FyXmZN79LZRY5cC/MuC49UQX
 4I6tA6S7KCQSo2TM8s/iyqQ5xpzHNGaMnJ0DJ3XJaDI0B7GhJ1sv9EB2F7ZA96je
 hvLnlGklPyPz/QuKP7tCa3OakXm0Lns6uietzRxNARb5feMfBr7/ACSuPPcoc8hg
 v3YDIrXGeffg8kQmIsbYCGkyv1Il/Axh8bVvArJHf7YF21z+3ROXkOnzeqdTfuHQ
 rnFKgPrT0pxLoCz1sh0lS5QCVpxatUvYvVN+M3iVhDjhCm73mOssnYqJ3DjSJfO/
 zFppRcF9D3bHCjyOYICt
 =5YQr
 -----END PGP SIGNATURE-----

Merge tag 'upstream-3.11-rc1' of git://git.infradead.org/linux-ubifs

Pull ubifs fix from Artem Bityutskiy:
 "Only a single patch which fixes a message"

* tag 'upstream-3.11-rc1' of git://git.infradead.org/linux-ubifs:
  UBIFS: correct mount message
2013-07-05 12:08:47 -07:00
Linus Torvalds 9e239bb939 Lots of bug fixes, cleanups and optimizations. In the bug fixes
category, of note is a fix for on-line resizing file systems where the
 block size is smaller than the page size (i.e., file systems 1k blocks
 on x86, or more interestingly file systems with 4k blocks on Power or
 ia64 systems.)
 
 In the cleanup category, the ext4's punch hole implementation was
 significantly improved by Lukas Czerner, and now supports bigalloc
 file systems.  In addition, Jan Kara significantly cleaned up the
 write submission code path.  We also improved error checking and added
 a few sanity checks.
 
 In the optimizations category, two major optimizations deserve
 mention.  The first is that ext4_writepages() is now used for
 nodelalloc and ext3 compatibility mode.  This allows writes to be
 submitted much more efficiently as a single bio request, instead of
 being sent as individual 4k writes into the block layer (which then
 relied on the elevator code to coalesce the requests in the block
 queue).  Secondly, the extent cache shrink mechanism, which was
 introduce in 3.9, no longer has a scalability bottleneck caused by the
 i_es_lru spinlock.  Other optimizations include some changes to reduce
 CPU usage and to avoid issuing empty commits unnecessarily.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJR0XhgAAoJENNvdpvBGATwMXkQAJwTPk5XYLqtAwLziFLvM6wG
 0tWa1QAzTNo80tLyM9iGqI6x74X5nddLw5NMICUmPooOa9agMuA4tlYVSss5jWzV
 yyB7vLzsc/2eZJusuVqfTKrdGybE+M766OI6VO9WodOoIF1l51JXKjktKeaWegfv
 NkcLKlakD4V+ZASEDB/cOcR/lTwAs9dQ89AZzgPiW+G8Do922QbqkENJB8mhalbg
 rFGX+lu9W0f3fqdmT3Xi8KGn3EglETdVd6jU7kOZN4vb5LcF5BKHQnnUmMlpeWMT
 ksOVasb3RZgcsyf5ZOV5feXV601EsNtPBrHAmH22pWQy3rdTIvMv/il63XlVUXZ2
 AXT3cHEvNQP0/yVaOTCZ9xQVxT8sL4mI6kENP9PtNuntx7E90JBshiP5m24kzTZ/
 zkIeDa+FPhsDx1D5EKErinFLqPV8cPWONbIt/qAgo6663zeeIyMVhzxO4resTS9k
 U2QEztQH+hDDbjgABtz9M/GjSrohkTYNSkKXzhTjqr/m5huBrVMngjy/F4/7G7RD
 vSEx5aXqyagnrUcjsupx+biJ1QvbvZWOVxAE/6hNQNRGDt9gQtHAmKw1eG2mugHX
 +TFDxodNE4iWEURenkUxXW3mDx7hFbGZR0poHG3M/LVhKMAAAw0zoKrrUG5c70G7
 XrddRLGlk4Hf+2o7/D7B
 =SwaI
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 update from Ted Ts'o:
 "Lots of bug fixes, cleanups and optimizations.  In the bug fixes
  category, of note is a fix for on-line resizing file systems where the
  block size is smaller than the page size (i.e., file systems 1k blocks
  on x86, or more interestingly file systems with 4k blocks on Power or
  ia64 systems.)

  In the cleanup category, the ext4's punch hole implementation was
  significantly improved by Lukas Czerner, and now supports bigalloc
  file systems.  In addition, Jan Kara significantly cleaned up the
  write submission code path.  We also improved error checking and added
  a few sanity checks.

  In the optimizations category, two major optimizations deserve
  mention.  The first is that ext4_writepages() is now used for
  nodelalloc and ext3 compatibility mode.  This allows writes to be
  submitted much more efficiently as a single bio request, instead of
  being sent as individual 4k writes into the block layer (which then
  relied on the elevator code to coalesce the requests in the block
  queue).  Secondly, the extent cache shrink mechanism, which was
  introduce in 3.9, no longer has a scalability bottleneck caused by the
  i_es_lru spinlock.  Other optimizations include some changes to reduce
  CPU usage and to avoid issuing empty commits unnecessarily."

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (86 commits)
  ext4: optimize starting extent in ext4_ext_rm_leaf()
  jbd2: invalidate handle if jbd2_journal_restart() fails
  ext4: translate flag bits to strings in tracepoints
  ext4: fix up error handling for mpage_map_and_submit_extent()
  jbd2: fix theoretical race in jbd2__journal_restart
  ext4: only zero partial blocks in ext4_zero_partial_blocks()
  ext4: check error return from ext4_write_inline_data_end()
  ext4: delete unnecessary C statements
  ext3,ext4: don't mess with dir_file->f_pos in htree_dirblock_to_tree()
  jbd2: move superblock checksum calculation to jbd2_write_superblock()
  ext4: pass inode pointer instead of file pointer to punch hole
  ext4: improve free space calculation for inline_data
  ext4: reduce object size when !CONFIG_PRINTK
  ext4: improve extent cache shrink mechanism to avoid to burn CPU time
  ext4: implement error handling of ext4_mb_new_preallocation()
  ext4: fix corruption when online resizing a fs with 1K block size
  ext4: delete unused variables
  ext4: return FIEMAP_EXTENT_UNKNOWN for delalloc extents
  jbd2: remove debug dependency on debug_fs and update Kconfig help text
  jbd2: use a single printk for jbd_debug()
  ...
2013-07-02 09:39:34 -07:00
Al Viro 01122e0688 [readdir] convert ubifs
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:56:25 +04:00
Artem Bityutskiy 605c912bb8 UBIFS: fix a horrid bug
Al Viro pointed me to the fact that '->readdir()' and '->llseek()' have no
mutual exclusion, which means the 'ubifs_dir_llseek()' can be run while we are
in the middle of 'ubifs_readdir()'.

This means that 'file->private_data' can be freed while 'ubifs_readdir()' uses
it, and this is a very bad bug: not only 'ubifs_readdir()' can return garbage,
but this may corrupt memory and lead to all kinds of problems like crashes an
security holes.

This patch fixes the problem by using the 'file->f_version' field, which
'->llseek()' always unconditionally sets to zero. We set it to 1 in
'ubifs_readdir()' and whenever we detect that it became 0, we know there was a
seek and it is time to clear the state saved in 'file->private_data'.

I tested this patch by writing a user-space program which runds readdir and
seek in parallell. I could easily crash the kernel without these patches, but
could not crash it with these patches.

Cc: stable@vger.kernel.org
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:45:37 +04:00