Commit Graph

468592 Commits

Author SHA1 Message Date
Theodore Ts'o 754cfed6bb ext4: drop the EXT4_STATE_DELALLOC_RESERVED flag
Having done a full regression test, we can now drop the
DELALLOC_RESERVED state flag.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2014-09-04 18:08:22 -04:00
Theodore Ts'o e3cf5d5d9a ext4: prepare to drop EXT4_STATE_DELALLOC_RESERVED
The EXT4_STATE_DELALLOC_RESERVED flag was originally implemented
because it was too hard to make sure the mballoc and get_block flags
could be reliably passed down through all of the codepaths that end up
calling ext4_mb_new_blocks().

Since then, we have mb_flags passed down through most of the code
paths, so getting rid of EXT4_STATE_DELALLOC_RESERVED isn't as tricky
as it used to.

This commit plumbs in the last of what is required, and then adds a
WARN_ON check to make sure we haven't missed anything.  If this passes
a full regression test run, we can then drop
EXT4_STATE_DELALLOC_RESERVED.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2014-09-04 18:07:25 -04:00
Theodore Ts'o a521100231 ext4: pass allocation_request struct to ext4_(alloc,splice)_branch
Instead of initializing the allocation_request structure in
ext4_alloc_branch(), set it up in ext4_ind_map_blocks(), and then pass
it to ext4_alloc_branch() and ext4_splice_branch().

This allows ext4_ind_map_blocks to pass flags in the allocation
request structure without having to add Yet Another argument to
ext4_alloc_branch().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2014-09-04 18:06:25 -04:00
Zheng Liu eb68d0e2fc ext4: track extent status tree shrinker delay statictics
This commit adds some statictics in extent status tree shrinker.  The
purpose to add these is that we want to collect more details when we
encounter a stall caused by extent status tree shrinker.  Here we count
the following statictics:
  stats:
    the number of all objects on all extent status trees
    the number of reclaimable objects on lru list
    cache hits/misses
    the last sorted interval
    the number of inodes on lru list
  average:
    scan time for shrinking some objects
    the number of shrunk objects
  maximum:
    the inode that has max nr. of objects on lru list
    the maximum scan time for shrinking some objects

The output looks like below:
  $ cat /proc/fs/ext4/sda1/es_shrinker_info
  stats:
    28228 objects
    6341 reclaimable objects
    5281/631 cache hits/misses
    586 ms last sorted interval
    250 inodes on lru list
  average:
    153 us scan time
    128 shrunk objects
  maximum:
    255 inode (255 objects, 198 reclaimable)
    125723 us max scan time

If the lru list has never been sorted, the following line will not be
printed:
    586ms last sorted interval
If there is an empty lru list, the following lines also will not be
printed:
    250 inodes on lru list
  ...
  maximum:
    255 inode (255 objects, 198 reclaimable)
    0 us max scan time

Meanwhile in this commit a new trace point is defined to print some
details in __ext4_es_shrink().

Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 22:26:49 -04:00
Zheng Liu e963bb1de4 ext4: improve extents status tree trace point
This commit improves the trace point of extents status tree.  We rename
trace_ext4_es_shrink_enter in ext4_es_count() because it is also used
in ext4_es_scan() and we can not identify them from the result.

Further this commit fixes a variable name in trace point in order to
keep consistency with others.

Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 22:22:13 -04:00
Seunghun Lee d91bd2c1d7 ext4: fix comments about get_blocks
get_blocks is renamed to get_block.

Signed-off-by: Seunghun Lee <waydi1@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 22:15:30 -04:00
Darrick J. Wong 45f1a9c3f6 ext4: enable block_validity by default
Enable by default the block_validity feature, which checks for
collisions between newly allocated blocks and critical system
metadata.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 21:34:09 -04:00
Theodore Ts'o 88fe1acb5b jbd2: fold __wait_cp_io into jbd2_log_do_checkpoint()
__wait_cp_io() is only called by jbd2_log_do_checkpoint().  Fold it in
to make it a bit easier to understand.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 21:26:09 -04:00
Theodore Ts'o be1158cc61 jbd2: fold __process_buffer() into jbd2_log_do_checkpoint()
__process_buffer() is only called by jbd2_log_do_checkpoint(), and it
had a very complex locking protocol where it would be called with the
j_list_lock, and sometimes exit with the lock held (if the return code
was 0), or release the lock.

This was confusing both to humans and to smatch (which erronously
complained that the lock was taken twice).

Folding __process_buffer() to the caller allows us to simplify the
control flow, making the resulting function easier to read and reason
about, and dropping the compiled size of fs/jbd2/checkpoint.c by 150
bytes (over 4% of the text size).

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2014-09-01 21:19:01 -04:00
Theodore Ts'o ed8a1a766a ext4: rename ext4_ext_find_extent() to ext4_find_extent()
Make the function name less redundant.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 14:43:09 -04:00
Theodore Ts'o 3bdf14b4d7 ext4: reuse path object in ext4_move_extents()
Reuse the path object in ext4_move_extents() so we don't unnecessarily
free and reallocate it.

Also clean up the get_ext_path() wrapper so that it has the same
semantics of freeing the path object on error as ext4_ext_find_extent().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 14:42:09 -04:00
Theodore Ts'o ee4bd0d963 ext4: reuse path object in ext4_ext_shift_extents()
Now that the semantics of ext4_ext_find_extent() are much cleaner,
it's safe and more efficient to reuse the path object across the
multiple calls to ext4_ext_find_extent() in ext4_ext_shift_extents().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 14:41:09 -04:00
Theodore Ts'o 10809df84a ext4: teach ext4_ext_find_extent() to realloc path if necessary
This adds additional safety in case for some reason we end reusing a
path structure which isn't big enough for current depth of the inode.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 14:40:09 -04:00
Theodore Ts'o b7ea89ad0a ext4: allow a NULL argument to ext4_ext_drop_refs()
Teach ext4_ext_drop_refs() to accept a NULL argument, much like
kfree().  This allows us to drop a lot of checks to make sure path is
non-NULL before calling ext4_ext_drop_refs().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 14:39:09 -04:00
Theodore Ts'o 523f431ccf ext4: call ext4_ext_drop_refs() from ext4_ext_find_extent()
In nearly all of the calls to ext4_ext_find_extent() where the caller
is trying to recycle the path object, ext4_ext_drop_refs() gets called
to release the buffer heads before the path object gets overwritten.
To simplify things for the callers, and to avoid the possibility of a
memory leak, make ext4_ext_find_extent() responsible for dropping the
buffers.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 14:38:09 -04:00
Theodore Ts'o dfe5080939 ext4: drop EXT4_EX_NOFREE_ON_ERR from rest of extents handling code
Drop EXT4_EX_NOFREE_ON_ERR from ext4_ext_create_new_leaf(),
ext4_split_extent(), ext4_convert_unwritten_extents_endio().

This requires fixing all of their callers to potentially
ext4_ext_find_extent() to free the struct ext4_ext_path object in case
of an error, and there are interlocking dependencies all the way up to
ext4_ext_map_blocks(), ext4_swap_extents(), and
ext4_ext_remove_space().

Once this is done, we can drop the EXT4_EX_NOFREE_ON_ERR flag since it
is no longer necessary.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 14:37:09 -04:00
Theodore Ts'o 4f224b8b7b ext4: drop EXT4_EX_NOFREE_ON_ERR in convert_initialized_extent()
Transfer responsibility of freeing struct ext4_ext_path on error to
ext4_ext_find_extent().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 14:36:09 -04:00
Theodore Ts'o e8b83d9303 ext4: collapse ext4_convert_initialized_extents()
The function ext4_convert_initialized_extents() is only called by a
single function --- ext4_ext_convert_initalized_extents().  Inline the
code and get rid of the unnecessary bits in order to simplify the code.

Rename ext4_ext_convert_initalized_extents() to
convert_initalized_extents() since it's a static function that is
actually only used in a single caller, ext4_ext_map_blocks().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 14:35:09 -04:00
Theodore Ts'o 705912ca95 ext4: teach ext4_ext_find_extent() to free path on error
Right now, there are a places where it is all to easy to leak memory
on an error path, via a usage like this:

	struct ext4_ext_path *path = NULL

	while (...) {
		...
		path = ext4_ext_find_extent(inode, block, path, 0);
		if (IS_ERR(path)) {
			/* oops, if path was non-NULL before the call to
			   ext4_ext_find_extent, we've leaked it!  :-(  */
			...
			return PTR_ERR(path);
		}
		...
	}

Unfortunately, there some code paths where we are doing the following
instead:

	path = ext4_ext_find_extent(inode, block, orig_path, 0);

and where it's important that we _not_ free orig_path in the case
where ext4_ext_find_extent() returns an error.

So change the function signature of ext4_ext_find_extent() so that it
takes a struct ext4_ext_path ** for its third argument, and by
default, on an error, it will free the struct ext4_ext_path, and then
zero out the struct ext4_ext_path * pointer.  In order to avoid
causing problems, we add a flag EXT4_EX_NOFREE_ON_ERR which causes
ext4_ext_find_extent() to use the original behavior of forcing the
caller to deal with freeing the original path pointer on the error
case.

The goal is to get rid of EXT4_EX_NOFREE_ON_ERR entirely, but this
allows for a gentle transition and makes the patches easier to verify.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 14:34:09 -04:00
Theodore Ts'o bd30d702fc ext4: fix accidental flag aliasing in ext4_map_blocks flags
Commit b8a8684502 introduced an accidental flag aliasing between
EXT4_EX_NOCACHE and EXT4_GET_BLOCKS_CONVERT_UNWRITTEN.

Fortunately, this didn't introduce any untorward side effects --- we
got lucky.  Nevertheless, fix this and leave a warning to hopefully
avoid this from happening in the future.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-09-01 14:33:09 -04:00
Theodore Ts'o 713e8dde3e ext4: fix ZERO_RANGE bug hidden by flag aliasing
We accidently aliased EXT4_EX_NOCACHE and EXT4_GET_CONVERT_UNWRITTEN
falgs, which apparently was hiding a bug that was unmasked when this
flag aliasing issue was addressed (see the subsequent commit).  The
reproduction case was:

   fsx -N 10000 -l 500000 -r 4096 -t 4096 -w 4096 -Z -R -W /vdb/junk

... which would cause fsx to report corruption in the data file.

The fix we have is a bit of an overkill, but I'd much rather be
conservative for now, and we can optimize ZERO_RANGE_FL handling
later.  The fact that we need to zap the extent_status cache for the
inode is unfortunate, but correctness is far more important than
performance.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
2014-09-01 14:32:09 -04:00
Theodore Ts'o 19008f6dfa ext4: fix ext4_swap_extents() error handling
If ext4_ext_find_extent() returns an error, we have to clear path1 or
path2 or else we would end up trying to free an ERR_PTR, which would
be bad.

Also eliminate some redundant code and mark the error paths as unlikely()

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-31 15:03:14 -04:00
Dmitry Monakhov fcf6b1b729 ext4: refactor ext4_move_extents code base
ext4_move_extents is too complex for review. It has duplicate almost
each function available in the rest of other codebase. It has useless
artificial restriction orig_offset == donor_offset. But in fact logic
of ext4_move_extents is very simple:

Iterate extents one by one (similar to ext4_fill_fiemap_extents)
   ->Iterate each page covered extent (similar to generic_perform_write)
     ->swap extents for covered by page (can be shared with IOC_MOVE_DATA)

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-30 23:52:19 -04:00
Dmitry Monakhov f8fb4f4150 ext4: use ext4_ext_next_allocated_block instead of mext_next_extent
This allows us to make mext_next_extent static and potentially get rid
of it.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-30 23:50:56 -04:00
Dmitry Monakhov ee124d2746 ext4: use ext4_update_i_disksize instead of opencoded ones
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-30 23:34:06 -04:00
Wang Shilong 52c826db6d ext4: remove a duplicate call in ext4_init_new_dir()
ext4_journal_get_write_access() has just been called in ext4_append()
calling it again here is duplicated.

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-29 23:20:44 -04:00
Theodore Ts'o f8b3b59d4d ext4: convert do_split() to use the ERR_PTR convention
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-29 20:52:18 -04:00
Theodore Ts'o dd73b5d5cb ext4: convert dx_probe() to use the ERR_PTR convention
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-29 20:52:17 -04:00
Theodore Ts'o 1c2150283c ext4: convert ext4_bread() to use the ERR_PTR convention
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-29 20:52:15 -04:00
Theodore Ts'o 1056008226 ext4: convert ext4_getblk() to use the ERR_PTR convention
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-29 20:51:32 -04:00
Theodore Ts'o 537d8f9380 ext4: convert ext4_dx_find_entry() to use the ERR_PTR convention
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-29 20:49:51 -04:00
Linus Torvalds d4f03186c8 Ext4 bug fixes for 3.17, to provide better handling of memory
allocation failures, and to fix some journaling bugs involving journal
 checksums and FALLOC_FL_ZERO_RANGE.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJT/+hGAAoJENNvdpvBGATwlU8P/02752nzboRRtqYZBxh/rP6L
 QoawhKslb516QFcwgxBpmhf/uSg0XIIakANEFSvlJksj0hcSLNmHl3SjGB6EGyu0
 1qOjgXSULFFnLGkjJ9ptzn266irQRR2AX5+mBP1T/JV6L5dRFwylCWbSElxEjobt
 WhUe0TzXjazYviItOugh8tQYKrfWlfc0UnMSOU7abastStYkROPuvUUOg0fcQCW/
 jZpgFQDKO+TmIZ/QtP26Bogz27Cthe5d1XnA9555JOyYjxpRh3HnVaZXLXOtA2nf
 eQZmDpfXCnbqORLsqDQbq1+TMMFVjudQyIgHkmMojshTc2PWGZyl/KtgxDHCoBxz
 j3a/qafUPbkqEKTLOunDggkWvOKhah7Z6ZCxzamC3d5Cy2GtjUhhp+iyllf4Tmga
 OEWIPp/5F3/UfJj/0e3fcmj8tzTP8bOgVh4xC/Iwf3wugKzeGs9iaWEs02TJpCQk
 Yu+xqhHP05MGQuMXcQbPJy+DPq3a43Y/PBlzyF9ZmvJKqs0SxRIhgDnpRXDLla/m
 a2zYkzqZBog081idgy1KSJjL1XVBjHkcMwUaZ/mOCd/ok7pAhQIYPVeJEpBylf+l
 ABtTgn8qU6+QmkaeTypY8h3OAMES/PgA+PQp46zgmPJVopov0926QuMWzXb2Wbq9
 ZFGJziWdAZos5XWnMo6A
 =e6Ou
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 bugfixes from Ted Ts'o:
 "Ext4 bug fixes for 3.17, to provide better handling of memory
  allocation failures, and to fix some journaling bugs involving
  journal checksums and FALLOC_FL_ZERO_RANGE"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix same-dir rename when inline data directory overflows
  jbd2: fix descriptor block size handling errors with journal_csum
  jbd2: fix infinite loop when recovering corrupt journal blocks
  ext4: update i_disksize coherently with block allocation on error path
  ext4: fix transaction issues for ext4_fallocate and ext_zero_range
  ext4: fix incorect journal credits reservation in ext4_zero_range
  ext4: move i_size,i_disksize update routines to helper function
  ext4: fix BUG_ON in mb_free_blocks()
  ext4: propagate errors up to ext4_find_entry()'s callers
2014-08-29 11:52:46 -07:00
Linus Torvalds ef13c8afa6 Fix 3.17-rc1 regression introduced by switching the DM crypt target to
using per-bio data.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJUAHLMAAoJEMUj8QotnQNa2ooH/39NBVEhppKaIHzqR6Ps9mI/
 B8kH3eDo9gNK5RAvu7E6QEW3ASSEBVk15DtdjtcnSCzDGlz+cWYCp0KXeptt9GDH
 3DtIg2hKhVddl4XusgO/GpCYZjQR75LDnNryOZTia+dFogP3HWPhZpg7DtQ9o+Ac
 9FChLFHPDy/yQ4QYDuepL3TgeTIDJoQTRkGvzOeYXnsZHU2v2nTJin3qQetDhd51
 2OEedOdrJ9znkj5AI3xL5AXTwl7231c8JZrMbz0oKmUSCvbqY7rrgWr/dFZM+mIt
 OwY4KEDdI06iHuNc2LhzUjbr6GaqTAnB3qSAZ8cSNBLlI+Lg5TatFO7YluUDmD4=
 =6jY1
 -----END PGP SIGNATURE-----

Merge tag 'dm-3.17-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fix from Mike Snitzer:
 "Fix a 3.17-rc1 regression introduced by switching the DM crypt target
  to using per-bio data"

* tag 'dm-3.17-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm crypt: fix access beyond the end of allocated space
2014-08-29 11:49:10 -07:00
Linus Torvalds 522a15db95 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
 "A smaller collection of fixes that have come up since the initial
  merge window pull request.  This contains:

   - error handling cleanup and support for larger than 16 byte cdbs in
     sg_io() from Christoph.  The latter just matches what bsg and
     friends support, sg_io() got left out in the merge.

   - an option for brd to expose partitions in /proc/partitions.  They
     are hidden by default for compat reasons.  From Dmitry Monakhov.

   - a few blk-mq fixes from me - killing a dead/unused flag, fix for
     merging happening even if turned off, and correction of a few
     comments.

   - removal of unnecessary ->owner setting in systemace.  From Michal
     Simek.

   - two related fixes for a problem with nesting freezing of queues in
     blk-mq.  One from Ming Lei removing an unecessary freeze operation,
     and another from Tejun fixing the nesting regression introduced in
     the merge window.

   - fix for a BUG_ON() at bio_endio time when protection info is
     attached and the IO has an error.  From Sagi Grimberg.

   - two scsi_ioctl bug fixes for regressions with scsi-mq from Tony
     Battersby.

   - a cfq weight update fix and subsequent comment update from Toshiaki
     Makita"

* 'for-linus' of git://git.kernel.dk/linux-block:
  cfq-iosched: Add comments on update timing of weight
  cfq-iosched: Fix wrong children_weight calculation
  block: fix error handling in sg_io
  fix regression in SCSI_IOCTL_SEND_COMMAND
  scsi-mq: fix requests that use a separate CDB buffer
  block: support > 16 byte CDBs for SG_IO
  block: cleanup error handling in sg_io
  brd: add ram disk visibility option
  block: systemace: Remove .owner field for driver
  blk-mq: blk_mq_freeze_queue() should allow nesting
  blk-mq: correct a few wrong/bad comments
  block: Fix BUG_ON when pi errors occur
  blk-mq: don't allow merges if turned off for the queue
  blk-mq: get rid of unused BLK_MQ_F_SHOULD_SORT flag
  blk-mq: fix WARNING "percpu_ref_kill() called more than once!"
2014-08-29 11:21:49 -07:00
Will Deacon 9e36c63395 alpha: io: implement relaxed accessor macros for writes
write{b,w,l,q}_relaxed are implemented by some architectures in order to
permit memory-mapped I/O writes with weaker barrier semantics than the
non-relaxed variants.

This patch implements these write macros for Alpha, in the same vein as
the relaxed read macros, which are already implemented.

Acked-by: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-29 11:18:45 -07:00
Michael Cree 5691e4456a alpha: Wire up sched_setattr, sched_getattr, and renameat2 syscalls.
Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-29 11:18:45 -07:00
Darrick J. Wong d80d448c6c ext4: fix same-dir rename when inline data directory overflows
When performing a same-directory rename, it's possible that adding or
setting the new directory entry will cause the directory to overflow
the inline data area, which causes the directory to be converted to an
extent-based directory.  Under this circumstance it is necessary to
re-read the directory when deleting the old dirent because the "old
directory" context still points to i_block in the inode table, which
is now an extent tree root!  The delete fails with an FS error, and
the subsequent fsck complains about incorrect link counts and
hardlinked directories.

Test case (originally found with flat_dir_test in the metadata_csum
test program):

# mkfs.ext4 -O inline_data /dev/sda
# mount /dev/sda /mnt
# mkdir /mnt/x
# touch /mnt/x/changelog.gz /mnt/x/copyright /mnt/x/README.Debian
# sync
# for i in /mnt/x/*; do mv $i $i.longer; done
# ls -la /mnt/x/
total 0
-rw-r--r-- 1 root root 0 Aug 25 12:03 changelog.gz.longer
-rw-r--r-- 1 root root 0 Aug 25 12:03 copyright
-rw-r--r-- 1 root root 0 Aug 25 12:03 copyright.longer
-rw-r--r-- 1 root root 0 Aug 25 12:03 README.Debian.longer

(Hey!  Why are there four files now??)

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2014-08-28 22:22:29 -04:00
Darrick J. Wong db9ee22036 jbd2: fix descriptor block size handling errors with journal_csum
It turns out that there are some serious problems with the on-disk
format of journal checksum v2.  The foremost is that the function to
calculate descriptor tag size returns sizes that are too big.  This
causes alignment issues on some architectures and is compounded by the
fact that some parts of jbd2 use the structure size (incorrectly) to
determine the presence of a 64bit journal instead of checking the
feature flags.

Therefore, introduce journal checksum v3, which enlarges the
descriptor block tag format to allow for full 32-bit checksums of
journal blocks, fix the journal tag function to return the correct
sizes, and fix the jbd2 recovery code to use feature flags to
determine 64bitness.

Add a few function helpers so we don't have to open-code quite so
many pieces.

Switching to a 16-byte block size was found to increase journal size
overhead by a maximum of 0.1%, to convert a 32-bit journal with no
checksumming to a 32-bit journal with checksum v3 enabled.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: TR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2014-08-28 22:22:29 -04:00
Darrick J. Wong 022eaa7517 jbd2: fix infinite loop when recovering corrupt journal blocks
When recovering the journal, don't fall into an infinite loop if we
encounter a corrupt journal block.  Instead, just skip the block and
return an error, which fails the mount and thus forces the user to run
a full filesystem fsck.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2014-08-28 22:22:28 -04:00
Dmitry Monakhov 6603120e96 ext4: update i_disksize coherently with block allocation on error path
In case of delalloc block i_disksize may be less than i_size. So we
have to update i_disksize each time we allocated and submitted some
blocks beyond i_disksize.  We weren't doing this on the error paths,
so fix this.

testcase: xfstest generic/019

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2014-08-28 22:20:41 -04:00
Mikulas Patocka d49ec52ff6 dm crypt: fix access beyond the end of allocated space
The DM crypt target accesses memory beyond allocated space resulting in
a crash on 32 bit x86 systems.

This bug is very old (it dates back to 2.6.25 commit 3a7f6c990a "dm
crypt: use async crypto").  However, this bug was masked by the fact
that kmalloc rounds the size up to the next power of two.  This bug
wasn't exposed until 3.17-rc1 commit 298a9fa08a ("dm crypt: use per-bio
data").  By switching to using per-bio data there was no longer any
padding beyond the end of a dm-crypt allocated memory block.

To minimize allocation overhead dm-crypt puts several structures into one
block allocated with kmalloc.  The block holds struct ablkcipher_request,
cipher-specific scratch pad (crypto_ablkcipher_reqsize(any_tfm(cc))),
struct dm_crypt_request and an initialization vector.

The variable dmreq_start is set to offset of struct dm_crypt_request
within this memory block.  dm-crypt allocates the block with this size:
cc->dmreq_start + sizeof(struct dm_crypt_request) + cc->iv_size.

When accessing the initialization vector, dm-crypt uses the function
iv_of_dmreq, which performs this calculation: ALIGN((unsigned long)(dmreq
+ 1), crypto_ablkcipher_alignmask(any_tfm(cc)) + 1).

dm-crypt allocated "cc->iv_size" bytes beyond the end of dm_crypt_request
structure.  However, when dm-crypt accesses the initialization vector, it
takes a pointer to the end of dm_crypt_request, aligns it, and then uses
it as the initialization vector.  If the end of dm_crypt_request is not
aligned on a crypto_ablkcipher_alignmask(any_tfm(cc)) boundary the
alignment causes the initialization vector to point beyond the allocated
space.

Fix this bug by calculating the variable iv_size_padding and adding it
to the allocated size.

Also correct the alignment of dm_crypt_request.  struct dm_crypt_request
is specific to dm-crypt (it isn't used by the crypto subsystem at all),
so it is aligned on __alignof__(struct dm_crypt_request).

Also align per_bio_data_size on ARCH_KMALLOC_MINALIGN, so that it is
aligned as if the block was allocated with kmalloc.

Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Tested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-08-28 14:24:09 -04:00
Linus Torvalds 59753a8054 One simple fix to invalidate GPIO non-request.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJT/00OAAoJEFGvii+H/Hdhv44P/30y36qDKV7bMUmICRtw/Cb0
 XwAfaBR75bLxVVmF/VtYPpuKJ4wTHGl3bg2Tx0XehcyuFYYgLglz/WsK6y7VYFCQ
 Yw/1QqalWWrW+jPJiaofQpL3P3Q12RoIvkdBI1L5KnDEvipdgHlXDkDwvd1L/JHc
 wYJvM2LVL4PCRc3x/7+L3pRhdyeCFpPgvNDk5Y4xbGOU8eoQd7fwQz8EJgmEEQzC
 Kv7mHUqi04lE1DnYIvbJ/YL1tSVYXRaCBRwWk4XtBTGYUi/FQ/edFoa0iXy/ym59
 u21zoNiYGxBpdRXEX+LbpJs0YF0rMHxYqpGgUdSJwU1YTF7dvqaZCmo5Qmnv6akE
 VF8nu1yvMPaPLcVuxuf7ttnZZsmwpD6BG1P8VHAwxWl6S2Fjkr9f9ommYLMF8cIL
 cx0PJkUgNHwHH2Zs56DhZjfGCvl9El9XFxL4pnGxCg94amiCKnbf6LeW/bPN9uxh
 4e1FU3+FQ+SAIcPmPRRfae5tM+zEFRyA7lNDEYXkETZ91v7hmykAi7kijw7D6RHc
 G4toOUpE2koIxlx/k4th8hrIwNexvKPq5fowHC3E6vLcXmQwT4Y5H1HxmXZnw36T
 qnLI2HBbUoBBuxLmXN64WOZHcPr5GNlS+9RpgsB7JnkH3E2jpB8/J1NuX5lOPowR
 pQqUmGr53NB9MVobkYS5
 =tMZh
 -----END PGP SIGNATURE-----

Merge tag 'backlight-fixes-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight

Pull backlight fix from Lee Jones:
 "One simple fix to invalidate GPIO non-request"

* tag 'backlight-fixes-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
  pwm-backlight: Fix bogus request for GPIO#0 when instantiated from DT
2014-08-28 10:47:10 -07:00
Linus Torvalds 2db3cff2d3 Couple of simple fixes due for the 3.17 rcs
(and a sneaky document addition that slipped from the previous pull-request)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJT/0s6AAoJEFGvii+H/HdhNiEP/R+zAKkTCmJyw2aeJR9RrNaO
 Qu2rzavdCsM/DKJiEKsF0QNvyMFM3S1Tx16pnOWcsCdJI9LVgeCRv7Fpn2mJKNRI
 UlmRHuG+SRu2nzr0gc+hyEOuz5VhpGemKo4tntUEPKrj5eXD25e3QgYHFIdTI32k
 c/bG7eKV7LbPSrybzLNw6FNRuH0YyN667PEck1HVFerCv/921LwO4seOVPAwecgu
 tELpaY7oxDTxoPe8QmxzmPWvCri1OjGaQTEfHoCWXfUaeYcOksI4mJ1zoIBZ0aHb
 uSY+wVnXfkAbmz7kzcJJvVYuf4PhI0mRh2uEhpvzJxMaetAA3JVf0/JXOhTGbATK
 itWgSqCVXooMf+8DIEsMkmXgqV0+Nvy0vZxv+viIdQ4ea9nUrWVT+KJmYsKp8lk9
 SAXsDv+4sr3POCN4QwvAsEKmujtngkOws2YSRRZqiPw8DAiNVpjgF/bZ6jZXVvQs
 wpOqGFJhwS3etyBica1hEZorhGoQzLEWgXDa+D+9jPs1TWVBJVYKeGaRFPGVZxgN
 6mAObpbv0/Uc6SUUlv55qfF3A1KUE2n7001SoD7zYEcTQ9jIcqw8CgKlb4kgrROv
 VeFb4wIxF6kGuijiKMZvwoH0x+QL/fXrvIVK6x8UN2Zy3+QMva+x8pu8tZQvTYun
 CibJKkoCwT2dth7D9pWS
 =DaB2
 -----END PGP SIGNATURE-----

Merge tag 'mfd-fixes-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull mfd fixes from Lee Jones:
 "Couple of simple fixes due for the 3.17 rcs

  (and a sneaky document addition that slipped from the previous
  pull-request)"

* tag 'mfd-fixes-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
  mfd: twl4030-power: Fix PM idle pin configuration to not conflict with regulators
  mfd: tc3589x: Add device tree bindings
  mfd: ab8500-core: Use 'ifdef' for config options
  mfd: htc-i2cpld: Fix %d confusingly prefixed with 0x in format string
  mfd: omap-usb-host: Fix %d confusingly prefixed with 0x in format string
2014-08-28 10:46:25 -07:00
Linus Torvalds 0caf14e66a Pin control fixes for the v3.17 series, only driver fixes:
- SH-PFC (Renesas) r8a7791 CAN bus pin group problem
 - Rockchip (GPIO0 configuration)
 - Tegra-xusb (interrupt handling)
 - Exynos (GPIO interrupt locking)
 - Qualcomm (fix misleading example interrupts)
 - Minor non-critical fixes for abx500 and AT91 also sneaked in,
   because I initially intended this pull for post RC-1, hope it's
   still OK.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJT/yB/AAoJEEEQszewGV1zV0wP/3QPm5P3bWLGLQNjEldlqX9u
 WHW8TetInUiylTzdV0klJcucGOj6TsKMuS+UEOPGg7dZ8cQ2JQ3CUgB7i8L6+Rhw
 cgbSyAtdos89xMTsu6WhR4x/1IVXE8KeitgPcshdOvhJM/b9OVowss2Y2gG2/fsZ
 duw5zN0Fj0R02BclJ5hj9GHUrnDfbk8VPvExdTHld2rjAZ4/CtL+zHbktOh44HgG
 C55HHbMSOk52U+/NYUJW6Yc7iFfXOpKJzmv+No8TdwaY67V80mrf3HjVIEoUCOh0
 DZiwRwr0nmnbtog2c8XLD+oQLpnVgbJsNub/1h+bvn+uHi5J2K746iSk5BDBhuzP
 PsK7hXTtFKvdpW762nqT+WRu+oDY4oor3YG6W/y5MQAHX513qNU+z5RiN90UV8/V
 WHz6X4qq52amv0larVUf4STK5dBcoEAQrbiI51g5CLnBQGCOj264go6ZWdgWPCT+
 rhu4fT8qprbBl8TOrKBUJXP4ZzesqTn5t0bVlVt+H153DCRgLVNw3C+xo2zHIlVz
 2Fz9oCcALdGPxmkEK2vdTRtGcvW3Y3FdwFCdkCTr4uSfyDzUWbdUvOwKEFYiW8+3
 FJsgOE6SDIzgNU5C7fF3g0DAoxesXVr1y+sZNVaVX93cIDk7LWs6xgWUipVg/Abp
 x9sm+cIYwWdpgCuWhULh
 =Aivv
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin-control fixes from Linus Walleij:
 "My first (a bit delayed) pack of pin control fixes for the v3.17
  series, only driver fixes:

   - SH-PFC (Renesas) r8a7791 CAN bus pin group problem
   - Rockchip (GPIO0 configuration)
   - Tegra-xusb (interrupt handling)
   - Exynos (GPIO interrupt locking)
   - Qualcomm (fix misleading example interrupts)
   - minor non-critical fixes for abx500 and AT91 also sneaked in,
     because I initially intended this pull for post RC-1, hope it's
     still OK"

* tag 'pinctrl-v3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: qcom: apq8064: Correct interrupts in example
  pinctrl: exynos: Lock GPIOs as interrupts when used as EINTs
  pinctrl: pinctrl-at91.c: fix decimal printf format specifiers prefixed with 0x
  pinctrl: abx500: remove useless check
  pinctrl: tegra-xusb: testing wrong variable in probe()
  pinctrl: tegra-xusb: fix an off by one test
  pinctrl: rockchip: fix rk3288 gpio0 configuration
  sh-pfc: r8a7791: fix CAN pin groups
2014-08-28 10:31:29 -07:00
Linus Torvalds daf543b177 Small dma-buf pull request for 3.17-rc3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJT/tOwAAoJEAG+/NWsLn5bzKwQAPIfgjf97bdZPl6wqKtegHk4
 kYxGb9mLTZ0p3zqQt4msmbiiAjd1J0GWaVfyBWzjIHg0QSZYKwwFKU9RxclS+NfL
 SvBQWDsj9EC0UBZOqWHaiJ1zGlKnOvS5mA/ldkFnqhhivCdfO/TMkbNAHT2Qm0sI
 JLAVkamfJP+LspL9/kG99MjQaVNKkLvnroOly16c2TD9B3NNkr9mxMrOhVyU+L+X
 XCPB9Vu38+xHCSCrV9Rn7BNCJcflHoqG3KIGGNoU+VF0FRY+nV4eZAuh1KbhseEP
 Ibo/Zjf0m67Mxq00lk3VyD61rsV6dcuqNk4YPGDV1/dS2IUCXV1JHcQ/6pOBdX9S
 c71yaf+YrByNsiXAvClT7oPFQcSVAPhwhxM2dyKdtnPzA59GQGutXJtBjBCC0bB+
 6MD9RkjFFAZRTQTb63J/8eMFrJ84wc8DZdL8x2PCq4Bf5X2jlbHOlbHM92LVflsD
 DrSpgnrKdSC08JazaP6Yhg46q801p3+Z07KxM6zecOyjwI27B+/Cqih57bPgKb7a
 cVrFHzQmQfW0RDqg3nr/UZxSNDQALQj43k2/v7ktoYnluK9OgBgWE+bmqEjOKNyk
 PbJl83brz1Ot/RkEBAtK0FgdO17Anz/42Q8n1k6cUEw33r1v9s+iTqMu8x143fLO
 bI0F5Q44LBd08Dgy128v
 =Tc07
 -----END PGP SIGNATURE-----

Merge tag 'for-3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf

Pull dma-buf fixes from Sumit Semwal:
 "The major changes for 3.17 already went via Greg-KH's tree this time
  as well; this is a small pull request for dma-buf - all documentation
  related"

* tag 'for-3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf:
  dma-buf/fence: Fix one more kerneldoc warning
  dma-buf/fence: Fix a kerneldoc warning
  Documentation/dma-buf-sharing.txt: update API descriptions
2014-08-28 10:30:25 -07:00
Linus Torvalds 521bd5e4d9 sound fixes for 3.17-rc3
Here contains not many exciting changes but just a few minor ones:
 An off-by-one proc write fix, a couple of trivial incldue guard
 fixes, Acer laptop pinconfig fix, and a fix for DSD formats that
 are still rarely used.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJT/w1wAAoJEGwxgFQ9KSmkkMAQAKlRFjLRrcM+y4ly8Gjr1ZTe
 QjkdPBdkQlWiMIPLlX1Xr+pWG5kZiUMiAl5lBkiF6ubcjVYl6KoJGzGzwcAdYkor
 H0tY2+mqinOS9gi3qwnTQjvlhGeTwLqs7hIkIjfaaUHSBn5I6TYsMgMQ2I6Rmo82
 Ox6bhJbunNRhCpyebzTjzgcruGej8FkzpJullWs6XTdxCY2rtFpVn0b6FUgdbab+
 uTgfBeckvtIA327s7qRmWyAOn5t73tCqV3CJ/PnBCXByODiMRrkaM0OLb/O7QvN5
 VTrKyGQyUhf3WNT6R1nuGPbC24ajxu5p0GkSZNuHWoerMIgNobtoBNplzL+P8UUs
 s83Sm21Y7n2jSXPs+rJdjy4MdLng6QcXH/ZoCnpZeIeVxs4qU4O9Q7HMDJZCYS7M
 EEutl8/gt43su4wHO2RGfU3DOIFnjtPQzqqkkwzdwoxB9jhMHfq73qRTrAHC3biB
 hqjRQRcX++Z5C0PPuVUvRSwidWTbeEfot3MvXKXaHyWTojyGRqfKsCXhudFLCNYU
 BlwAxrLI8kWaKdr9oVe+KS1XoBVQJIIar62plwFozTEuHuQ2P0FNpP6MFViPNoFf
 UOHvXCTHi8JaVSBjhlKHrtto2Zcak1tIlQ1Ewog0Wg1cJrDOdWxHEwJ2tUI0N7aC
 fZ3lwxhRarA9URNrjNqV
 =raXL
 -----END PGP SIGNATURE-----

Merge tag 'sound-3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Here contains not many exciting changes but just a few minor ones: An
  off-by-one proc write fix, a couple of trivial incldue guard fixes,
  Acer laptop pinconfig fix, and a fix for DSD formats that are still
  rarely used"

* tag 'sound-3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Set up initial pins for Acer Aspire V5
  ALSA: pcm: Fix the silence data for DSD formats
  ALSA: ctxfi: ct20k1reg: Fix typo in include guard
  ALSA: hda: ca0132_regs.h: Fix typo in include guard
  ALSA: core: fix buffer overflow in snd_info_get_line()
2014-08-28 09:44:25 -07:00
Linus Torvalds 6e8f7b09e4 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Nothing major, one core oops fixes, some radeon oops fixes, some sti
  driver fixups, msm driver fixes and a minor Kconfig update for the ww
  mutex debugging"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/ast: Add missing entry to dclk_table[]
  drm: fix division-by-zero on dumb_create()
  ww-mutex: clarify help text for DEBUG_WW_MUTEX_SLOWPATH
  radeon: Test for PCI root bus before assuming bus->self
  drm/radeon: handle broken disabled rb mask gracefully (6xx/7xx) (v2)
  drm/radeon: save/restore the PD addr on suspend/resume
  drm/msm: Fix missing unlock on error in msm_fbdev_create()
  drm/msm: fix compile error for non-dt builds
  drm/msm/mdp4: request vblank during modeset
  drm/msm: avoid flood of kernel logs on faults
  drm: sti: Add missing dependency on RESET_CONTROLLER
  drm: sti: Make of_device_id array const
  drm: sti: Fix return value check in sti_drm_platform_probe()
  drm: sti: hda: fix return value check in sti_hda_probe()
  drm: sti: hdmi: fix return value check in sti_hdmi_probe()
  drm: sti: tvout: fix return value check in sti_tvout_probe()
2014-08-28 09:40:37 -07:00
Tony Lindgren daebabd578 mfd: twl4030-power: Fix PM idle pin configuration to not conflict with regulators
Commit 43fef47f94 (mfd: twl4030-power: Add a configuration to turn
off oscillator during off-idle) added support for configuring the PMIC
to cut off resources during deeper idle states to save power.

This however caused regression for n900 display power that needed the
PMIC configuration to be disabled with commit d937678ab6 (ARM: dts:
Revert enabling of twl configuration for n900).

Turns out the root cause of the problem is that we must use
TWL4030_RESCONFIG_UNDEF instead of DEV_GRP_NULL to avoid disabling
regulators that may have been enabled before the init function
for twl4030-power.c runs. With TWL4030_RESCONFIG_UNDEF we let the
regulator framework control the regulators like it should. Here we
need to only configure the sys_clken and sys_off_mode triggers for
the regulators that cannot be done by the regulator framework as
it's not running at that point.

This allows us to enable the PMIC configuration for n900.

Fixes: 43fef47f94 (mfd: twl4030-power: Add a configuration to turn off oscillator during off-idle)

Cc: stable@vger.kernel.org # v3.16
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-08-28 15:57:55 +01:00
Linus Walleij bc80436033 mfd: tc3589x: Add device tree bindings
This defines the device tree bindings for the Toshiba TC3589x
series of multi-purpose expanders. Only the stuff I can test
is defined: GPIO and keypad. Others may implement more
subdevices further down the road.

This is to complement
commit a435ae1d51
"mfd: Enable the tc3589x for Device Tree" which left off
the definition of the device tree bindings.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-08-28 15:57:54 +01:00
Toshiaki Makita 7b5af5cffc cfq-iosched: Add comments on update timing of weight
Explain that weight has to be updated on activation.
This complements previous fix e15693ef18 ("cfq-iosched: Fix wrong
children_weight calculation").

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-08-28 08:16:29 -06:00