linux_old1

Commit Graph

Author	SHA1	Message	Date
David Howells	a528d35e8b	statx: Add a system call to make enhanced file info available Add a system call to make extended file information available, including file creation and some attribute flags where available through the underlying filesystem. The getattr inode operation is altered to take two additional arguments: a u32 request_mask and an unsigned int flags that indicate the synchronisation mode. This change is propagated to the vfs_getattr() function. Functions like vfs_stat() are now inline wrappers around new functions vfs_statx() and vfs_statx_fd() to reduce stack usage. ======== OVERVIEW ======== The idea was initially proposed as a set of xattrs that could be retrieved with getxattr(), but the general preference proved to be for a new syscall with an extended stat structure. A number of requests were gathered for features to be included. The following have been included: (1) Make the fields a consistent size on all arches and make them large. (2) Spare space, request flags and information flags are provided for future expansion. (3) Better support for the y2038 problem [Arnd Bergmann] (tv_sec is an __s64). (4) Creation time: The SMB protocol carries the creation time, which could be exported by Samba, which will in turn help CIFS make use of FS-Cache as that can be used for coherency data (stx_btime). This is also specified in NFSv4 as a recommended attribute and could be exported by NFSD [Steve French]. (5) Lightweight stat: Ask for just those details of interest, and allow a netfs (such as NFS) to approximate anything not of interest, possibly without going to the server [Trond Myklebust, Ulrich Drepper, Andreas Dilger] (AT_STATX_DONT_SYNC). (6) Heavyweight stat: Force a netfs to go to the server, even if it thinks its cached attributes are up to date [Trond Myklebust] (AT_STATX_FORCE_SYNC). And the following have been left out for future extension: (7) Data version number: Could be used by userspace NFS servers [Aneesh Kumar]. Can also be used to modify fill_post_wcc() in NFSD which retrieves i_version directly, but has just called vfs_getattr(). It could get it from the kstat struct if it used vfs_xgetattr() instead. (There's disagreement on the exact semantics of a single field, since not all filesystems do this the same way). (8) BSD stat compatibility: Including more fields from the BSD stat such as creation time (st_btime) and inode generation number (st_gen) [Jeremy Allison, Bernd Schubert]. (9) Inode generation number: Useful for FUSE and userspace NFS servers [Bernd Schubert]. (This was asked for but later deemed unnecessary with the open-by-handle capability available and caused disagreement as to whether it's a security hole or not). (10) Extra coherency data may be useful in making backups [Andreas Dilger]. (No particular data were offered, but things like last backup timestamp, the data version number and the DOS archive bit would come into this category). (11) Allow the filesystem to indicate what it can/cannot provide: A filesystem can now say it doesn't support a standard stat feature if that isn't available, so if, for instance, inode numbers or UIDs don't exist or are fabricated locally... (This requires a separate system call - I have an fsinfo() call idea for this). (12) Store a 16-byte volume ID in the superblock that can be returned in struct xstat [Steve French]. (Deferred to fsinfo). (13) Include granularity fields in the time data to indicate the granularity of each of the times (NFSv4 time_delta) [Steve French]. (Deferred to fsinfo). (14) FS_IOC_GETFLAGS value. These could be translated to BSD's st_flags. Note that the Linux IOC flags are a mess and filesystems such as Ext4 define flags that aren't in linux/fs.h, so translation in the kernel may be a necessity (or, possibly, we provide the filesystem type too). (Some attributes are made available in stx_attributes, but the general feeling was that the IOC flags were to ext[234]-specific and shouldn't be exposed through statx this way). (15) Mask of features available on file (eg: ACLs, seclabel) [Brad Boyer, Michael Kerrisk]. (Deferred, probably to fsinfo. Finding out if there's an ACL or seclabal might require extra filesystem operations). (16) Femtosecond-resolution timestamps [Dave Chinner]. (A __reserved field has been left in the statx_timestamp struct for this - if there proves to be a need). (17) A set multiple attributes syscall to go with this. =============== NEW SYSTEM CALL =============== The new system call is: int ret = statx(int dfd, const char filename, unsigned int flags, unsigned int mask, struct statx buffer); The dfd, filename and flags parameters indicate the file to query, in a similar way to fstatat(). There is no equivalent of lstat() as that can be emulated with statx() by passing AT_SYMLINK_NOFOLLOW in flags. There is also no equivalent of fstat() as that can be emulated by passing a NULL filename to statx() with the fd of interest in dfd. Whether or not statx() synchronises the attributes with the backing store can be controlled by OR'ing a value into the flags argument (this typically only affects network filesystems): (1) AT_STATX_SYNC_AS_STAT tells statx() to behave as stat() does in this respect. (2) AT_STATX_FORCE_SYNC will require a network filesystem to synchronise its attributes with the server - which might require data writeback to occur to get the timestamps correct. (3) AT_STATX_DONT_SYNC will suppress synchronisation with the server in a network filesystem. The resulting values should be considered approximate. mask is a bitmask indicating the fields in struct statx that are of interest to the caller. The user should set this to STATX_BASIC_STATS to get the basic set returned by stat(). It should be noted that asking for more information may entail extra I/O operations. buffer points to the destination for the data. This must be 256 bytes in size. ====================== MAIN ATTRIBUTES RECORD ====================== The following structures are defined in which to return the main attribute set: struct statx_timestamp { __s64 tv_sec; __s32 tv_nsec; __s32 __reserved; }; struct statx { __u32 stx_mask; __u32 stx_blksize; __u64 stx_attributes; __u32 stx_nlink; __u32 stx_uid; __u32 stx_gid; __u16 stx_mode; __u16 __spare0[1]; __u64 stx_ino; __u64 stx_size; __u64 stx_blocks; __u64 __spare1[1]; struct statx_timestamp stx_atime; struct statx_timestamp stx_btime; struct statx_timestamp stx_ctime; struct statx_timestamp stx_mtime; __u32 stx_rdev_major; __u32 stx_rdev_minor; __u32 stx_dev_major; __u32 stx_dev_minor; __u64 __spare2[14]; }; The defined bits in request_mask and stx_mask are: STATX_TYPE Want/got stx_mode & S_IFMT STATX_MODE Want/got stx_mode & ~S_IFMT STATX_NLINK Want/got stx_nlink STATX_UID Want/got stx_uid STATX_GID Want/got stx_gid STATX_ATIME Want/got stx_atime{,_ns} STATX_MTIME Want/got stx_mtime{,_ns} STATX_CTIME Want/got stx_ctime{,_ns} STATX_INO Want/got stx_ino STATX_SIZE Want/got stx_size STATX_BLOCKS Want/got stx_blocks STATX_BASIC_STATS [The stuff in the normal stat struct] STATX_BTIME Want/got stx_btime{,_ns} STATX_ALL [All currently available stuff] stx_btime is the file creation time, stx_mask is a bitmask indicating the data provided and __spares[] are where as-yet undefined fields can be placed. Time fields are structures with separate seconds and nanoseconds fields plus a reserved field in case we want to add even finer resolution. Note that times will be negative if before 1970; in such a case, the nanosecond fields will also be negative if not zero. The bits defined in the stx_attributes field convey information about a file, how it is accessed, where it is and what it does. The following attributes map to FS__FL flags and are the same numerical value: STATX_ATTR_COMPRESSED File is compressed by the fs STATX_ATTR_IMMUTABLE File is marked immutable STATX_ATTR_APPEND File is append-only STATX_ATTR_NODUMP File is not to be dumped STATX_ATTR_ENCRYPTED File requires key to decrypt in fs Within the kernel, the supported flags are listed by: KSTAT_ATTR_FS_IOC_FLAGS [Are any other IOC flags of sufficient general interest to be exposed through this interface?] New flags include: STATX_ATTR_AUTOMOUNT Object is an automount trigger These are for the use of GUI tools that might want to mark files specially, depending on what they are. Fields in struct statx come in a number of classes: (0) stx_dev_, stx_blksize. These are local system information and are always available. (1) stx_mode, stx_nlinks, stx_uid, stx_gid, stx_[amc]time, stx_ino, stx_size, stx_blocks. These will be returned whether the caller asks for them or not. The corresponding bits in stx_mask will be set to indicate whether they actually have valid values. If the caller didn't ask for them, then they may be approximated. For example, NFS won't waste any time updating them from the server, unless as a byproduct of updating something requested. If the values don't actually exist for the underlying object (such as UID or GID on a DOS file), then the bit won't be set in the stx_mask, even if the caller asked for the value. In such a case, the returned value will be a fabrication. Note that there are instances where the type might not be valid, for instance Windows reparse points. (2) stx_rdev_*. This will be set only if stx_mode indicates we're looking at a blockdev or a chardev, otherwise will be 0. (3) stx_btime. Similar to (1), except this will be set to 0 if it doesn't exist. ======= TESTING ======= The following test program can be used to test the statx system call: samples/statx/test-statx.c Just compile and run, passing it paths to the files you want to examine. The file is built automatically if CONFIG_SAMPLES is enabled. Here's some example output. Firstly, an NFS directory that crosses to another FSID. Note that the AUTOMOUNT attribute is set because transiting this directory will cause d_automount to be invoked by the VFS. [root@andromeda ~]# /tmp/test-statx -A /warthog/data statx(/warthog/data) = 0 results=7ff Size: 4096 Blocks: 8 IO Block: 1048576 directory Device: 00:26 Inode: 1703937 Links: 125 Access: (3777/drwxrwxrwx) Uid: 0 Gid: 4041 Access: 2016-11-24 09:02:12.219699527+0000 Modify: 2016-11-17 10:44:36.225653653+0000 Change: 2016-11-17 10:44:36.225653653+0000 Attributes: 0000000000001000 (-------- -------- -------- -------- -------- -------- ---m---- --------) Secondly, the result of automounting on that directory. [root@andromeda ~]# /tmp/test-statx /warthog/data statx(/warthog/data) = 0 results=7ff Size: 4096 Blocks: 8 IO Block: 1048576 directory Device: 00:27 Inode: 2 Links: 125 Access: (3777/drwxrwxrwx) Uid: 0 Gid: 4041 Access: 2016-11-24 09:02:12.219699527+0000 Modify: 2016-11-17 10:44:36.225653653+0000 Change: 2016-11-17 10:44:36.225653653+0000 Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-03-02 20:51:15 -05:00
Boris Brezillon	7554769641	UBI: provide an helper to check whether a LEB is mapped or not This is part of the process of hiding UBI EBA's internal to other part of the UBI implementation, so that we can add new information to the EBA table without having to patch different places in the UBI code. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2016-10-02 22:48:14 +02:00
Boris Brezillon	9a5f09ac0a	UBI: add an helper to check lnum validity ubi_leb_valid() is here to replace the lnum < 0 \|\| lnum >= vol->reserved_pebs checks. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2016-10-02 22:48:14 +02:00
Richard Weinberger	61edc3f3b5	ubi: Don't bypass ->getattr() Directly accessing inode fields bypasses ->getattr() and can cause problems when the underlying filesystem does not have the default ->getattr() implementation. So instead of obtaining the backing inode via d_backing_inode() use vfs_getattr() and obtain what we need from the kstat struct. Cc: Al Viro <viro@zeniv.linux.org.uk> Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Richard Weinberger <richard@nod.at>	2016-06-14 10:51:42 +02:00
Richard Weinberger	ad022c8718	Revert "mtd: switch ubi_open_volume_path() to vfs_stat()" This reverts commit `322ea0bbf3`. vfs_stat() can only be used on user supplied buffers. UBI's kapi.c is the API to the kernel and therefore vfs_stat() is inappropriate. This solves the problem that mounting any UBIFS will immediately fail with -EINVAL. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Richard Weinberger <richard@nod.at>	2016-06-14 10:51:42 +02:00
Linus Torvalds	23a3e178b9	This pull request contains mostly cleanups and minor improvements of UBI and UBIFS. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJXSJl6AAoJEEtJtSqsAOnWbbAP/2ls5KpGsRSoOP4NsKo5+8k9 GsZ8hb0iN+UGDxMB5Jlj6O1ISNPz/8o7iGBuWa5OWFdh4Nn38sA1Qv996Rg3Ca3O 8KYEGAD7POWANfxCTmyQX/L8AsOP62B3diFktPyGrtYmLsVzx/AFN9/nM27ticdF PQpUtLwvL/m7/oE6ymFCB4x34+DkTa7Jx4e+chVlF61CUipGc5I60VVX1Do+/DWt S37ajjIxMJx0BO7Zs6vrcH9uGFSbvSpVBr9/sC16t0Fa1/vAa9pXieg3y5XZmtOl +6NlwUxXVu5IjHkYYGZP8jsrA7bKJlflgCo/w6WKC6V0KECnBx+oSG9uMjDVPBSM rc4VF4nevnNLNqFGBRWhxiYrRUCdB1TcWVDOzM16peNeUNJiWj/+4PbsiJQBwECH ZnE/dBrjU7v+J8UmHsUJSveW6/5PluJf1loDzrip7gk5QD13wdpPo3VOycOj1vMD iS6H/TBjtcEzWWh7qcBbtgkbHQH0g+w9YSzVy7ODFweuoq+4qCZ1davqzzZaRv3q hylor1gea3b4GnH1EDjmXyQOOuIf2rLdJhou8WoZhmy2q3FrnE+89/CCOCMkeQMQ qQr1lAbxD7A+8bYwSZ7Nvv6lrBUrat59DV6FHFc69MdkGT7J6z85ZOyd3lxRvJc/ XiZZkKhrYSKgV/iLFhHW =L16y -----END PGP SIGNATURE----- Merge tag 'upstream-4.7-rc1' of git://git.infradead.org/linux-ubifs Pull UBI/UBIFS updates from Richard Weinberger: "This contains mostly cleanups and minor improvements of UBI and UBIFS" * tag 'upstream-4.7-rc1' of git://git.infradead.org/linux-ubifs: ubifs: ubifs_dump_inode: Fix dumping field bulk_read UBI: Fix static volume checks when Fastmap is used UBI: Set free_count to zero before walking through erase list UBI: Silence an unintialized variable warning UBI: Clean up return in ubi_remove_volume() UBI: Modify wrong comment in ubi_leb_map function. UBI: Don't read back all data in ubi_eba_copy_leb() UBI: Add ro-mode sysfs attribute	2016-05-27 18:49:29 -07:00
z00189512	960b35d06b	UBI: Modify wrong comment in ubi_leb_map function. Signed-off-by: z00189512 <abc.zhangliang@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2016-05-24 15:24:20 +02:00
Al Viro	322ea0bbf3	mtd: switch ubi_open_volume_path() to vfs_stat() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-03-27 23:49:17 -04:00
David Howells	bb668734c4	VFS: assorted d_backing_inode() annotations Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-15 15:06:59 -04:00
Richard Weinberger	9ff08979e1	UBI: Add initial support for scatter gather Adds a new set of functions to deal with scatter gather. ubi_eba_read_leb_sg() will read from a LEB into a scatter gather list. The new data structure struct ubi_sgl will be used within UBI to hold the scatter gather list itself and metadata to have a cursor within the list. Signed-off-by: Richard Weinberger <richard@nod.at> Tested-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> Reviewed-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>	2015-01-28 16:04:26 +01:00
Richard Weinberger	fafdd2bf26	UBI: Implement UBI_METAONLY UBI_METAONLY is a new open mode for UBI volumes, it indicates that only meta data is being changed. Meta data in terms of UBI volumes means data which is stored in the UBI volume table but not on the volume itself. While it does not interfere with UBI_READONLY and UBI_READWRITE it is not allowed to use UBI_METAONLY together with UBI_EXCLUSIVE. Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Cc: Andrew Murray <amurray@embedded-bits.co.uk> Signed-off-by: Richard Weinberger <richard@nod.at> Tested-by: Guido Martínez <guido@vanguardiasur.com.ar> Reviewed-by: Guido Martínez <guido@vanguardiasur.com.ar> Tested-by: Christoph Fritz <chf.fritz@googlemail.com> Tested-by: Andrew Murray <amurray@embedded-bits.co.uk>	2015-01-28 15:57:04 +01:00
Tanya Brokhman	3260870331	UBI: Extend UBI layer debug/messaging capabilities If there is more then one UBI device mounted, there is no way to distinguish between messages from different UBI devices. Add device number to all ubi layer message types. The R/O block driver messages were replaced by pr_* since ubi_device structure is not used by it. Amended a bit by Artem. Signed-off-by: Tanya Brokhman <tlinder@codeaurora.org> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>	2014-11-07 12:08:51 +02:00
Joel Reardon	62f384552b	UBI: modify ubi_wl_flush function to clear work queue for a lnum This patch modifies ubi_wl_flush to force the erasure of particular volume id / logical eraseblock number pairs. Previous functionality is preserved when passing UBI_ALL for both values. The locations where ubi_wl_flush were called are appropriately changed: ubi_leb_erase only flushes for the erased LEB, and ubi_create_volume forces only flushing for its volume id. External code can call this new feature via the new function ubi_flush() added to kapi.c, which simply passes through to ubi_wl_flush(). This was tested by disabling the call to do_work in ubi thread, which results in the work queue remaining unless explicitly called to remove. UBIFS was changed to call ubifs_leb_change 50 times for four different LEBs. Then the new function was called to clear the queue: passing wrong volume ids / lnum, correct ones, and finally UBI_ALL for both to ensure it was finally all cleard. The work queue was dumped each time and the selective removal of the particular LEB numbers was observed. Extra checks were enabled and ubifs's integck was also run. Finally, the drive was repeatedly filled and emptied to ensure that the queue was cleared normally. Artem: amended the patch. Signed-off-by: Joel Reardon <reardonj@inf.ethz.ch> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>	2012-05-21 11:34:41 +03:00
Artem Bityutskiy	e2986827d5	UBI: get rid of dbg_err This patch removes the 'dbg_err()' macro and we now use 'ubi_err' instead. The idea of 'dbg_err()' was to compile out some error message to make the binary a bit smaller - but I think it was a bad idea. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>	2012-05-20 20:26:00 +03:00
Richard Weinberger	b36a261e8c	UBI: Kill data type hint We do not need this feature and to our shame it even was not working and there was a bug found very recently. -- Artem Bityutskiy Without the data type hint UBI2 (fastmap) will be easier to implement. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>	2012-05-20 20:25:59 +03:00
Artem Bityutskiy	327cf2922b	mtd: do not use mtd->sync directly This patch teaches 'mtd_sync()' to do nothing when the MTD driver does not have the '->sync()' method, which allows us to remove all direct 'mtd->sync' accesses. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2012-01-09 18:26:21 +00:00
Artem Bityutskiy	85f2f2a809	mtd: introduce mtd_sync interface Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2012-01-09 18:25:35 +00:00
Brian Norris	d57f40544a	mtd: utilize `mtd_is_*()' functions Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@intel.com>	2011-09-21 09:19:06 +03:00
Artem Bityutskiy	f43ec882b8	UBI: provide LEB offset information Provide the LEB offset information in the UBI device information data structure. This piece of information is required by UBIFS to find out what are the LEB offsets which are aligned to the max. write size. If LEB offset not aligned to max. write size, then UBIFS has to take this into account to write more optimally. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2011-03-08 10:12:48 +02:00
Artem Bityutskiy	30b542ef45	UBI: incorporate maximum write size Incorporate MTD write buffer size into UBI device information because UBIFS needs this field. UBI does not use it ATM, just provides to upper layers in 'struct ubi_device_info'. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2011-03-08 10:12:48 +02:00
Shinya Kuribayashi	3f5026222e	UBI: fix s/then/than/ typos Signed-off-by: Shinya Kuribayashi <shinya.kuribayashi.px@renesas.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2010-05-07 08:33:10 +03:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Artem Bityutskiy	b531b55a7b	UBI: add more checks to chdev open When opening UBI volumes by their character device names, make sure we are opening character devices, not block devices or any other inode type. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2010-01-12 13:19:15 +02:00
Corentin Chary	b571028418	UBI: Add ubi_open_volume_path Add an 'ubi_open_volume_path(path, mode)' function which works like 'open_bdev_exclusive(path, mode, ...)' where path is the special file representing the UBI volume, typically /dev/ubi0_0. This is needed to teach UBIFS being able to mount UBI character devices. [Comments and the patch were amended a bit by Artem] Signed-off-by: Corentin Chary <corentincj@iksaif.net> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2009-11-24 08:18:54 +02:00
Dmitry Pervushin	0e0ee1cc33	UBI: add notification API UBI volume notifications are intended to create the API to get clients notified about volume creation/deletion, renaming and re-sizing. A client can subscribe to these notifications using 'ubi_volume_register()' and cancel the subscription using 'ubi_volume_unregister()'. When UBI volumes change, a blocking notifier is called. Clients also can request "added" events on all volumes that existed before client subscribed to the notifications. If we use notifications instead of calling functions like 'ubi_gluebi_xxx()', we can make the MTD emulation layer to be more flexible: build it as a separate module and load/unload it on demand. [Artem: many cleanups, rework locking, add "updated" event, provide device/volume info in notifiers] Signed-off-by: Dmitry Pervushin <dpervushin@embeddedalley.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2009-06-02 13:53:35 +03:00
Artem Bityutskiy	e1cf7e6dd4	UBI: improve debugging messages Various minor improvements to the debugging messages which I found useful while hunting problems. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2009-05-18 12:28:25 +03:00
Coly Li	73ac36ea14	fix similar typos to successfull When I review ocfs2 code, find there are 2 typos to "successfull". After doing grep "successfull " in kernel tree, 22 typos found totally -- great minds always think alike :) This patch fixes all the similar typos. Thanks for Randy's ack and comments. Signed-off-by: Coly Li <coyli@suse.de> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Roland Dreier <rolandd@cisco.com> Cc: Jeremy Kerr <jk@ozlabs.org> Cc: Jeff Garzik <jeff@garzik.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Vlad Yasevich <vladislav.yasevich@hp.com> Cc: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:15 -08:00
Artem Bityutskiy	c8566350a3	UBI: fix and re-work debugging stuff Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-07-24 13:34:45 +03:00
Artem Bityutskiy	a5bf619041	UBI: add ubi_sync() interface To flush MTD device caches. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-07-24 13:32:56 +03:00
Kyungmin Park	cadb40ccc1	UBI: avoid unnecessary division operations UBI already checks that @min io size is the power of 2 at io_init. It is save to use bit operations then. Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-07-24 13:32:54 +03:00
Artem Bityutskiy	ae616e1be1	UBI: fix warnings drivers/mtd/ubi/cdev.c: In function ‘vol_cdev_read’: drivers/mtd/ubi/cdev.c:187: warning: unused variable ‘vol_id’ CC [M] drivers/mtd/ubi/kapi.o drivers/mtd/ubi/kapi.c: In function ‘ubi_leb_erase’: drivers/mtd/ubi/kapi.c:483: warning: unused variable ‘vol_id’ drivers/mtd/ubi/kapi.c: In function ‘ubi_leb_unmap’: drivers/mtd/ubi/kapi.c:544: warning: unused variable ‘vol_id’ drivers/mtd/ubi/kapi.c: In function ‘ubi_leb_map’: drivers/mtd/ubi/kapi.c:582: warning: unused variable ‘vol_id’ Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-01-25 16:41:24 +02:00
Artem Bityutskiy	783b273afa	UBI: use separate mutex for volumes checking Introduce a separate mutex which serializes volumes checking, because we cammot really use volumes_mutex - it cases reverse locking problems with mtd_tbl_mutex when gluebi is used - thanks to lockdep. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2007-12-26 19:15:17 +02:00
Artem Bityutskiy	e73f4459d9	UBI: add UBI devices reference counting This is one more step on the way to "removable" UBI devices. It adds reference counting for UBI devices. Every time a volume on this device is opened - the device's refcount is increased. It is also increased if someone is reading any sysfs file of this UBI device or of one of its volumes. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2007-12-26 19:15:17 +02:00
Artem Bityutskiy	d05c77a816	UBI: introduce volume refcounting Add ref_count field to UBI volumes and remove weired "vol->removed" field. This way things are better understandable and we do not have to do whold show_attr operation under spinlock. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2007-12-26 19:15:16 +02:00
Artem Bityutskiy	35ad5fb76c	UBI: fix and cleanup volume opening functions This patch fixes error codes of the functions - if the device number is out of range, -EINVAL should be returned. It also removes unneeded try_module_get call from the open by name function. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2007-12-26 19:15:15 +02:00
Artem Bityutskiy	450f872a8e	UBI: get device when opening volume When a volume is opened, get its kref via get_device() call. And put the reference when closing the volume. With this, we may have a bit saner volume delete. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2007-12-26 19:15:15 +02:00
Artem Bityutskiy	cae0a77125	UBI: tweak volumes locking Transform vtbl_mutex to volumes_mutex - this just makes code easier to understand. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2007-12-26 19:15:15 +02:00
Artem Bityutskiy	89b96b6929	UBI: improve internal interfaces Pass volume description object to the EBA function which makes more sense, and EBA function do not have to find the volume description object by volume ID. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2007-12-26 19:15:15 +02:00
Artem Bityutskiy	49dfc29928	UBI: remove redundant field Remove redundant ubi->major field - we have it in ubi->cdev.dev already. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2007-12-26 19:15:14 +02:00
Artem Bityutskiy	393852ecfe	UBI: add ubi_leb_map interface The idea of this interface belongs to Adrian Hunter. The interface is extremely useful when one has to have a guarantee that an LEB will contain all 0xFFs even in case of an unclean reboot. UBI does have an 'ubi_leb_erase()' call which may do this, but it is stupid and ineffecient, because it flushes whole queue. I should be re-worked to just be a pair of unmap, map calls. The user of the interfaci is UBIFS at the moment. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2007-12-26 19:15:14 +02:00
Jesper Juhl	0169b49d52	UBI: don't use array index before testing if it is negative I can't find anything guaranteeing that 'ubi_num' cannot be <0 in drivers/mtd/ubi/kapi.c::ubi_open_volume(), and in fact the code even tests for that and errors out if so. Unfortunately the test for "ubi_num < 0" happens after we've already used 'ubi_num' as an array index - bad thing to do if it is negative. This patch moves the test earlier in the function and then moves the indexing using that variable after the check. A bit safer :-) Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2007-10-14 13:10:20 +03:00
Artem Bityutskiy	503990ebb2	UBI: remove unneeded error checks Pointed to by viro. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2007-07-18 16:58:53 +03:00
Fernando Luis Vázquez Cao	2db61c95c0	UBI: cleanup usage of try_module_get The use of try_module_get(THIS_MODULE) in ubi_get_device_info does not offer real protection against unexpected driver unloads, since we could be preempted before try_modules_get gets executed. It is the caller who should manipulate the refcounts. Besides, ubi_get_device_info is an exported symbol which guarantees protection when accessed through symbol_get. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2007-07-18 16:58:45 +03:00
Artem Bityutskiy	4ab60a0d7c	UBI: do not let to read too much In case of static volumes it is prohibited to read more data then available. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2007-07-18 16:52:42 +03:00
Artem B. Bityutskiy	801c135ce7	UBI: Unsorted Block Images UBI (Latin: "where?") manages multiple logical volumes on a single flash device, specifically supporting NAND flash devices. UBI provides a flexible partitioning concept which still allows for wear-levelling across the whole flash device. In a sense, UBI may be compared to the Logical Volume Manager (LVM). Whereas LVM maps logical sector numbers to physical HDD sector numbers, UBI maps logical eraseblocks to physical eraseblocks. More information may be found at http://www.linux-mtd.infradead.org/doc/ubi.html Partitioning/Re-partitioning An UBI volume occupies a certain number of erase blocks. This is limited by a configured maximum volume size, which could also be viewed as the partition size. Each individual UBI volume's size can be changed independently of the other UBI volumes, provided that the sum of all volume sizes doesn't exceed a certain limit. UBI supports dynamic volumes and static volumes. Static volumes are read-only and their contents are protected by CRC check sums. Bad eraseblocks handling UBI transparently handles bad eraseblocks. When a physical eraseblock becomes bad, it is substituted by a good physical eraseblock, and the user does not even notice this. Scrubbing On a NAND flash bit flips can occur on any write operation, sometimes also on read. If bit flips persist on the device, at first they can still be corrected by ECC, but once they accumulate, correction will become impossible. Thus it is best to actively scrub the affected eraseblock, by first copying it to a free eraseblock and then erasing the original. The UBI layer performs this type of scrubbing under the covers, transparently to the UBI volume users. Erase Counts UBI maintains an erase count header per eraseblock. This frees higher-level layers (like file systems) from doing this and allows for centralized erase count management instead. The erase counts are used by the wear-levelling algorithm in the UBI layer. The algorithm itself is exchangeable. Booting from NAND For booting directly from NAND flash the hardware must at least be capable of fetching and executing a small portion of the NAND flash. Some NAND flash controllers have this kind of support. They usually limit the window to a few kilobytes in erase block 0. This "initial program loader" (IPL) must then contain sufficient logic to load and execute the next boot phase. Due to bad eraseblocks, which may be randomly scattered over the flash device, it is problematic to store the "secondary program loader" (SPL) statically. Also, due to bit-flips it may become corrupted over time. UBI allows to solve this problem gracefully by storing the SPL in a small static UBI volume. UBI volumes vs. static partitions UBI volumes are still very similar to static MTD partitions: * both consist of eraseblocks (logical eraseblocks in case of UBI volumes, and physical eraseblocks in case of static partitions; * both support three basic operations - read, write, erase. But UBI volumes have the following advantages over traditional static MTD partitions: * there are no eraseblock wear-leveling constraints in case of UBI volumes, so the user should not care about this; * there are no bit-flips and bad eraseblocks in case of UBI volumes. So, UBI volumes may be considered as flash devices with relaxed restrictions. Where can it be found? Documentation, kernel code and applications can be found in the MTD gits. What are the applications for? The applications help to create binary flash images for two purposes: pfi files (partial flash images) for in-system update of UBI volumes, and plain binary images, with or without OOB data in case of NAND, for a manufacturing step. Furthermore some tools are/and will be created that allow flash content analysis after a system has crashed.. Who did UBI? The original ideas, where UBI is based on, were developed by Andreas Arnez, Frank Haverkamp and Thomas Gleixner. Josh W. Boyer and some others were involved too. The implementation of the kernel layer was done by Artem B. Bityutskiy. The user-space applications and tools were written by Oliver Lohmann with contributions from Frank Haverkamp, Andreas Arnez, and Artem. Joern Engel contributed a patch which modifies JFFS2 so that it can be run on a UBI volume. Thomas Gleixner did modifications to the NAND layer. Alexander Schmidt made some testing work as well as core functionality improvements. Signed-off-by: Artem B. Bityutskiy <dedekind@linutronix.de> Signed-off-by: Frank Haverkamp <haver@vnet.ibm.com>	2007-04-27 14:23:33 +03:00

45 Commits