linux

Commit Graph

Author	SHA1	Message	Date
Paulo Alcantara	5c1acf3fe0	cifs: fix regression when mounting shares with prefix paths The commit `315db9a05b` ("cifs: fix leak in cifs_smb3_do_mount() ctx") revealed an existing bug when mounting shares that contain a prefix path or DFS links. cifs_setup_volume_info() requires the @devname to contain the full path (UNC + prefix) to update the fs context with the new UNC and prepath values, however we were passing only the UNC path (old_ctx->UNC) in @device thus discarding any prefix paths. Instead of concatenating both old_ctx->{UNC,prepath} and pass it in @devname, just keep the dup'ed values of UNC and prepath in cifs_sb->ctx after calling smb3_fs_context_dup(), and fix smb3_parse_devname() to correctly parse and not leak the new UNC and prefix paths. Cc: <stable@vger.kernel.org> # v5.11+ Fixes: `315db9a05b` ("cifs: fix leak in cifs_smb3_do_mount() ctx") Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Acked-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-05-04 11:52:56 -05:00
Rohith Surabattula	c3f207ab29	cifs: Deferred close for files When file is closed, SMB2 close request is not sent to server immediately and is deferred for acregmax defined interval. When file is reopened by same process for read or write, the file handle is reused if an oplock is held. When client receives a oplock/lease break, file is closed immediately if reference count is zero, else oplock is downgraded. Signed-off-by: Rohith Surabattula <rohiths@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-05-03 11:20:35 -05:00
Steve French	fee742b502	smb3.1.1: enable negotiating stronger encryption by default Now that stronger encryption (gcm256) has been more broadly tested, and confirmed to work with multiple servers (Windows and Azure for example), enable it by default. Although gcm256 is the second choice we offer (after gcm128 which should be faster), this change allows mounts to server which are configured to require the strongest encryption to work (without changing a module load parameter). Signed-off-by: Steve French <stfrench@microsoft.com> Suggested-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2021-04-27 23:18:40 -05:00
Steve French	b8d64f8ced	smb3: add rasize mount parameter to improve readahead performance In some cases readahead of more than the read size can help (to allow parallel i/o of read ahead which can improve performance). Ceph introduced a mount parameter "rasize" to allow controlling this. Add mount parameter "rasize" to allow control of amount of readahead requested of the server. If rasize not set, rasize defaults to negotiated rsize as before. Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-04-25 23:59:08 -05:00
David Disseldorp	315db9a05b	cifs: fix leak in cifs_smb3_do_mount() ctx cifs_smb3_do_mount() calls smb3_fs_context_dup() and then cifs_setup_volume_info(). The latter's subsequent smb3_parse_devname() call overwrites the cifs_sb->ctx->UNC string already dup'ed by smb3_fs_context_dup(), resulting in a leak. E.g. unreferenced object 0xffff888002980420 (size 32): comm "mount", pid 160, jiffies 4294892541 (age 30.416s) hex dump (first 32 bytes): 5c 5c 31 39 32 2e 31 36 38 2e 31 37 34 2e 31 30 \\192.168.174.10 34 5c 72 61 70 69 64 6f 2d 73 68 61 72 65 00 00 4\rapido-share.. backtrace: [<00000000069e12f6>] kstrdup+0x28/0x50 [<00000000b61f4032>] smb3_fs_context_dup+0x127/0x1d0 [cifs] [<00000000c6e3e3bf>] cifs_smb3_do_mount+0x77/0x660 [cifs] [<0000000063467a6b>] smb3_get_tree+0xdf/0x220 [cifs] [<00000000716f731e>] vfs_get_tree+0x1b/0x90 [<00000000491d3892>] path_mount+0x62a/0x910 [<0000000046b2e774>] do_mount+0x50/0x70 [<00000000ca7b64dd>] __x64_sys_mount+0x81/0xd0 [<00000000b5122496>] do_syscall_64+0x33/0x40 [<000000002dd397af>] entry_SYSCALL_64_after_hwframe+0x44/0xae This change is a bandaid until the cifs_setup_volume_info() TODO and error handling issues are resolved. Signed-off-by: David Disseldorp <ddiss@suse.de> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> CC: <stable@vger.kernel.org> # v5.11+ Signed-off-by: Steve French <stfrench@microsoft.com>	2021-04-25 16:28:24 -05:00
Ronnie Sahlberg	5e9c89d43f	cifs: Grab a reference for the dentry of the cached directory during the lifetime of the cache We need to hold both a reference for the root/superblock as well as the directory that we are caching. We need to drop these references before we call kill_anon_sb(). At this point, the root and the cached dentries are always the same but this will change once we start caching other directories as well. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-04-25 16:28:23 -05:00
Ronnie Sahlberg	269f67e1ff	cifs: store a pointer to the root dentry in cifs_sb_info once we have completed mounting the share And use this to only allow to take out a shared handle once the mount has completed and the sb becomes available. This will become important in follow up patches where we will start holding a reference to the directory dentry for the shared handle during the lifetime of the handle. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-04-25 16:28:23 -05:00
Aurelien Aptel	ec4e4862a9	cifs: remove old dead code While reviewing a patch clarifying locks and locking hierarchy I realized some locks were unused. This commit removes old data and code that isn't actually used anywhere, or hidden in ifdefs which cannot be enabled from the kernel config. * The uid/gid trees and associated locks are left-overs from when uid/sid mapping had an extra caching layer on top of the keyring and are now unused. See commit `faa65f07d2` ("cifs: simplify id_to_sid and sid_to_id mapping code") from 2012. * cifs_oplock_break_ops is a left-over from when slow_work was remplaced by regular workqueue and is now unused. See commit `9b64697246` ("cifs: use workqueue instead of slow-work") from 2010. * CIFSSMBSetAttrLegacy is SMB1 cruft dealing with some legacy NT4/Win9x behaviour. * Remove CONFIG_CIFS_DNOTIFY_EXPERIMENTAL left-overs. This was already partially removed in `392e1c5dc9` ("cifs: rename and clarify CIFS_ASYNC_OP and CIFS_NO_RESP") from 2019. Kill it completely. * Another candidate that was considered but spared is CONFIG_CIFS_NFSD_EXPORT which has an empty implementation and cannot be enabled by a config option (although it is listed but disabled with "BROKEN" as a dep). It's unclear whether this could even function today in its current form but it has it's own .c file and Kconfig entry which is a bit more involved to remove and might make a come back? Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-04-25 16:28:22 -05:00
Aurelien Aptel	b7fd0fa0ea	cifs: simplify SWN code with dummy funcs instead of ifdefs This commit doesn't change the logic of SWN. Add dummy implementation of SWN functions when SWN is disabled instead of using ifdef sections. The dummy functions get optimized out, this leads to clearer code and compile time type-checking regardless of config options with no runtime penalty. Leave the simple ifdefs section as-is. A single bitfield (bool foo:1) on its own will use up one int. Move tcon->use_witness out of ifdefs with the other tcon bitfields. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Samuel Cabrero <scabrero@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-04-25 16:28:22 -05:00
Maciek Borzecki	0fc9322ab5	cifs: escape spaces in share names Commit `653a5efb84` ("cifs: update super_operations to show_devname") introduced the display of devname for cifs mounts. However, when mounting a share which has a whitespace in the name, that exact share name is also displayed in mountinfo. Make sure that all whitespace is escaped. Signed-off-by: Maciek Borzecki <maciek.borzecki@gmail.com> CC: <stable@vger.kernel.org> # 5.11+ Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-04-07 21:30:27 -05:00
Paulo Alcantara	14302ee330	cifs: return proper error code in statfs(2) In cifs_statfs(), if server->ops->queryfs is not NULL, then we should use its return value rather than always returning 0. Instead, use rc variable as it is properly set to 0 in case there is no server->ops->queryfs. Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> CC: <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-03-08 21:23:11 -06:00
Linus Torvalds	c19798af2e	cifs/smb3 fixes including improvements to mode bit conversion when using cifsacl mount option, new mount options for controlling attribute caching, improvements to crediting and reconnect, improved debugging -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmA4bocACgkQiiy9cAdy T1HuMwv/bmZ53ilDwhCph/UKoLJm/bRyvCp+GBECtLLS/C/4qz5IBLpPPr2yhOyH gmkeCZZWhj0nzGAYxhVDAdBz9IPEA7bae503IaOuk5uXaCXC8htsq/Cd7qpJmHlf 5vCdBTmHBiUt02dUcZ9A3bm855xJLEINHH9YdJM157ysqLgttIibLQB0F/gJ49DR QIIdq7sZNJXcTgRsUzJZNnrWLDi2oVIoUlq5M6d8ypmZC0ArPNfrSafjW6h5rqpj UYBwtUDNwQiS0lgwR4mji4PCen0GGwMFtyVDOpdJLJq3fO995yse2BRk0BFHVH1i xfAskQjkxAHEcfzQC1cM4ouT/WYu8nHaLK1vp/1lVr93mo8KqSX+SW/bLsXfpQkm w//xMy94HdM2pgyM6J1pYnKPb7s/DG19RYPktQ5oYn0fR5qYlqALAmd02JRjO3xV cbjbmWXXzFFsFJc5MJmM6wVLJzRb4a50SN1W37aHNXWi8ktpsaNzz33LGw1pF2OT K6P2DqJe =RQo/ -----END PGP SIGNATURE----- Merge tag '5.12-smb3-part1' of git://git.samba.org/sfrench/cifs-2.6 Pull cifs updates from Steve French: - improvements to mode bit conversion, chmod and chown when using cifsacl mount option - two new mount options for controlling attribute caching - improvements to crediting and reconnect, improved debugging - reconnect fix - add SMB3.1.1 dialect to default dialects for vers=3 * tag '5.12-smb3-part1' of git://git.samba.org/sfrench/cifs-2.6: (27 commits) cifs: update internal version number cifs: use discard iterator to discard unneeded network data more efficiently cifs: introduce helper for finding referral server to improve DFS target resolution cifs: check all path components in resolved dfs target cifs: fix DFS failover cifs: fix nodfs mount option cifs: fix handling of escaped ',' in the password mount argument cifs: Add new parameter "acregmax" for distinct file and directory metadata timeout cifs: convert revalidate of directories to using directory metadata cache timeout cifs: Add new mount parameter "acdirmax" to allow caching directory metadata cifs: If a corrupted DACL is returned by the server, bail out. cifs: minor simplification to smb2_is_network_name_deleted TCON Reconnect during STATUS_NETWORK_NAME_DELETED cifs: cleanup a few le16 vs. le32 uses in cifsacl.c cifs: Change SIDs in ACEs while transferring file ownership. cifs: Retain old ACEs when converting between mode bits and ACL. cifs: Fix cifsacl ACE mask for group and others. cifs: clarify hostname vs ip address in /proc/fs/cifs/DebugData cifs: change confusing field serverName (to ip_addr) cifs: Fix inconsistent IS_ERR and PTR_ERR ...	2021-02-26 14:09:41 -08:00
Steve French	5780464614	cifs: Add new parameter "acregmax" for distinct file and directory metadata timeout The new optional mount parameter "acregmax" allows a different timeout for file metadata ("acdirmax" now allows controlling timeout for directory metadata). Setting "actimeo" still works as before, and changes timeout for both files and directories, but specifying "acregmax" or "acdirmax" allows overriding the default more granularly which can be a big performance benefit on some workloads. "acregmax" is already used by NFS as a mount parameter (albeit with a larger default and thus looser caching). Suggested-by: Tom Talpey <tom@talpey.com> Reviewed-By: Tom Talpey <tom@talpey.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-02-25 11:47:49 -06:00
Steve French	4c9f948142	cifs: Add new mount parameter "acdirmax" to allow caching directory metadata nfs and cifs on Linux currently have a mount parameter "actimeo" to control metadata (attribute) caching but cifs does not have additional mount parameters to allow distinguishing between caching directory metadata (e.g. needed to revalidate paths) and that for files. Add new mount parameter "acdirmax" to allow caching metadata for directories more loosely than file data. NFS adjusts metadata caching from acdirmin to acdirmax (and another two mount parms for files) but to reduce complexity, it is safer to just introduce the one mount parm to allow caching directories longer. The defaults for acdirmax and actimeo (for cifs.ko) are conservative, 1 second (NFS defaults acdirmax to 60 seconds). For many workloads, setting acdirmax to a higher value is safe and will improve performance. This patch leaves unchanged the default values for caching metadata for files and directories but gives the user more flexibility in adjusting them safely for their workload via the new mount parm. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-By: Tom Talpey <tom@talpey.com>	2021-02-25 11:47:42 -06:00
Linus Torvalds	7d6beb71da	idmapped-mounts-v5.12 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYCegywAKCRCRxhvAZXjc ouJ6AQDlf+7jCQlQdeKKoN9QDFfMzG1ooemat36EpRRTONaGuAD8D9A4sUsG4+5f 4IU5Lj9oY4DEmF8HenbWK2ZHsesL2Qg= =yPaw -----END PGP SIGNATURE----- Merge tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull idmapped mounts from Christian Brauner: "This introduces idmapped mounts which has been in the making for some time. Simply put, different mounts can expose the same file or directory with different ownership. This initial implementation comes with ports for fat, ext4 and with Christoph's port for xfs with more filesystems being actively worked on by independent people and maintainers. Idmapping mounts handle a wide range of long standing use-cases. Here are just a few: - Idmapped mounts make it possible to easily share files between multiple users or multiple machines especially in complex scenarios. For example, idmapped mounts will be used in the implementation of portable home directories in systemd-homed.service(8) where they allow users to move their home directory to an external storage device and use it on multiple computers where they are assigned different uids and gids. This effectively makes it possible to assign random uids and gids at login time. - It is possible to share files from the host with unprivileged containers without having to change ownership permanently through chown(2). - It is possible to idmap a container's rootfs and without having to mangle every file. For example, Chromebooks use it to share the user's Download folder with their unprivileged containers in their Linux subsystem. - It is possible to share files between containers with non-overlapping idmappings. - Filesystem that lack a proper concept of ownership such as fat can use idmapped mounts to implement discretionary access (DAC) permission checking. - They allow users to efficiently changing ownership on a per-mount basis without having to (recursively) chown(2) all files. In contrast to chown (2) changing ownership of large sets of files is instantenous with idmapped mounts. This is especially useful when ownership of a whole root filesystem of a virtual machine or container is changed. With idmapped mounts a single syscall mount_setattr syscall will be sufficient to change the ownership of all files. - Idmapped mounts always take the current ownership into account as idmappings specify what a given uid or gid is supposed to be mapped to. This contrasts with the chown(2) syscall which cannot by itself take the current ownership of the files it changes into account. It simply changes the ownership to the specified uid and gid. This is especially problematic when recursively chown(2)ing a large set of files which is commong with the aforementioned portable home directory and container and vm scenario. - Idmapped mounts allow to change ownership locally, restricting it to specific mounts, and temporarily as the ownership changes only apply as long as the mount exists. Several userspace projects have either already put up patches and pull-requests for this feature or will do so should you decide to pull this: - systemd: In a wide variety of scenarios but especially right away in their implementation of portable home directories. https://systemd.io/HOME_DIRECTORY/ - container runtimes: containerd, runC, LXD:To share data between host and unprivileged containers, unprivileged and privileged containers, etc. The pull request for idmapped mounts support in containerd, the default Kubernetes runtime is already up for quite a while now: https://github.com/containerd/containerd/pull/4734 - The virtio-fs developers and several users have expressed interest in using this feature with virtual machines once virtio-fs is ported. - ChromeOS: Sharing host-directories with unprivileged containers. I've tightly synced with all those projects and all of those listed here have also expressed their need/desire for this feature on the mailing list. For more info on how people use this there's a bunch of talks about this too. Here's just two recent ones: https://www.cncf.io/wp-content/uploads/2020/12/Rootless-Containers-in-Gitpod.pdf https://fosdem.org/2021/schedule/event/containers_idmap/ This comes with an extensive xfstests suite covering both ext4 and xfs: https://git.kernel.org/brauner/xfstests-dev/h/idmapped_mounts It covers truncation, creation, opening, xattrs, vfscaps, setid execution, setgid inheritance and more both with idmapped and non-idmapped mounts. It already helped to discover an unrelated xfs setgid inheritance bug which has since been fixed in mainline. It will be sent for inclusion with the xfstests project should you decide to merge this. In order to support per-mount idmappings vfsmounts are marked with user namespaces. The idmapping of the user namespace will be used to map the ids of vfs objects when they are accessed through that mount. By default all vfsmounts are marked with the initial user namespace. The initial user namespace is used to indicate that a mount is not idmapped. All operations behave as before and this is verified in the testsuite. Based on prior discussions we want to attach the whole user namespace and not just a dedicated idmapping struct. This allows us to reuse all the helpers that already exist for dealing with idmappings instead of introducing a whole new range of helpers. In addition, if we decide in the future that we are confident enough to enable unprivileged users to setup idmapped mounts the permission checking can take into account whether the caller is privileged in the user namespace the mount is currently marked with. The user namespace the mount will be marked with can be specified by passing a file descriptor refering to the user namespace as an argument to the new mount_setattr() syscall together with the new MOUNT_ATTR_IDMAP flag. The system call follows the openat2() pattern of extensibility. The following conditions must be met in order to create an idmapped mount: - The caller must currently have the CAP_SYS_ADMIN capability in the user namespace the underlying filesystem has been mounted in. - The underlying filesystem must support idmapped mounts. - The mount must not already be idmapped. This also implies that the idmapping of a mount cannot be altered once it has been idmapped. - The mount must be a detached/anonymous mount, i.e. it must have been created by calling open_tree() with the OPEN_TREE_CLONE flag and it must not already have been visible in the filesystem. The last two points guarantee easier semantics for userspace and the kernel and make the implementation significantly simpler. By default vfsmounts are marked with the initial user namespace and no behavioral or performance changes are observed. The manpage with a detailed description can be found here: `1d7b902e28` In order to support idmapped mounts, filesystems need to be changed and mark themselves with the FS_ALLOW_IDMAP flag in fs_flags. The patches to convert individual filesystem are not very large or complicated overall as can be seen from the included fat, ext4, and xfs ports. Patches for other filesystems are actively worked on and will be sent out separately. The xfstestsuite can be used to verify that port has been done correctly. The mount_setattr() syscall is motivated independent of the idmapped mounts patches and it's been around since July 2019. One of the most valuable features of the new mount api is the ability to perform mounts based on file descriptors only. Together with the lookup restrictions available in the openat2() RESOLVE_* flag namespace which we added in v5.6 this is the first time we are close to hardened and race-free (e.g. symlinks) mounting and path resolution. While userspace has started porting to the new mount api to mount proper filesystems and create new bind-mounts it is currently not possible to change mount options of an already existing bind mount in the new mount api since the mount_setattr() syscall is missing. With the addition of the mount_setattr() syscall we remove this last restriction and userspace can now fully port to the new mount api, covering every use-case the old mount api could. We also add the crucial ability to recursively change mount options for a whole mount tree, both removing and adding mount options at the same time. This syscall has been requested multiple times by various people and projects. There is a simple tool available at https://github.com/brauner/mount-idmapped that allows to create idmapped mounts so people can play with this patch series. I'll add support for the regular mount binary should you decide to pull this in the following weeks: Here's an example to a simple idmapped mount of another user's home directory: u1001@f2-vm:/$ sudo ./mount --idmap both:1000:1001:1 /home/ubuntu/ /mnt u1001@f2-vm:/$ ls -al /home/ubuntu/ total 28 drwxr-xr-x 2 ubuntu ubuntu 4096 Oct 28 22:07 . drwxr-xr-x 4 root root 4096 Oct 28 04:00 .. -rw------- 1 ubuntu ubuntu 3154 Oct 28 22:12 .bash_history -rw-r--r-- 1 ubuntu ubuntu 220 Feb 25 2020 .bash_logout -rw-r--r-- 1 ubuntu ubuntu 3771 Feb 25 2020 .bashrc -rw-r--r-- 1 ubuntu ubuntu 807 Feb 25 2020 .profile -rw-r--r-- 1 ubuntu ubuntu 0 Oct 16 16:11 .sudo_as_admin_successful -rw------- 1 ubuntu ubuntu 1144 Oct 28 00:43 .viminfo u1001@f2-vm:/$ ls -al /mnt/ total 28 drwxr-xr-x 2 u1001 u1001 4096 Oct 28 22:07 . drwxr-xr-x 29 root root 4096 Oct 28 22:01 .. -rw------- 1 u1001 u1001 3154 Oct 28 22:12 .bash_history -rw-r--r-- 1 u1001 u1001 220 Feb 25 2020 .bash_logout -rw-r--r-- 1 u1001 u1001 3771 Feb 25 2020 .bashrc -rw-r--r-- 1 u1001 u1001 807 Feb 25 2020 .profile -rw-r--r-- 1 u1001 u1001 0 Oct 16 16:11 .sudo_as_admin_successful -rw------- 1 u1001 u1001 1144 Oct 28 00:43 .viminfo u1001@f2-vm:/$ touch /mnt/my-file u1001@f2-vm:/$ setfacl -m u:1001:rwx /mnt/my-file u1001@f2-vm:/$ sudo setcap -n 1001 cap_net_raw+ep /mnt/my-file u1001@f2-vm:/$ ls -al /mnt/my-file -rw-rwxr--+ 1 u1001 u1001 0 Oct 28 22:14 /mnt/my-file u1001@f2-vm:/$ ls -al /home/ubuntu/my-file -rw-rwxr--+ 1 ubuntu ubuntu 0 Oct 28 22:14 /home/ubuntu/my-file u1001@f2-vm:/$ getfacl /mnt/my-file getfacl: Removing leading '/' from absolute path names # file: mnt/my-file # owner: u1001 # group: u1001 user::rw- user:u1001:rwx group::rw- mask::rwx other::r-- u1001@f2-vm:/$ getfacl /home/ubuntu/my-file getfacl: Removing leading '/' from absolute path names # file: home/ubuntu/my-file # owner: ubuntu # group: ubuntu user::rw- user:ubuntu:rwx group::rw- mask::rwx other::r--" * tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: (41 commits) xfs: remove the possibly unused mp variable in xfs_file_compat_ioctl xfs: support idmapped mounts ext4: support idmapped mounts fat: handle idmapped mounts tests: add mount_setattr() selftests fs: introduce MOUNT_ATTR_IDMAP fs: add mount_setattr() fs: add attr_flags_to_mnt_flags helper fs: split out functions to hold writers namespace: only take read lock in do_reconfigure_mnt() mount: make {lock,unlock}_mount_hash() static namespace: take lock_mount_hash() directly when changing flags nfs: do not export idmapped mounts overlayfs: do not mount on top of idmapped mounts ecryptfs: do not mount on top of idmapped mounts ima: handle idmapped mounts apparmor: handle idmapped mounts fs: make helpers idmap mount aware exec: handle idmapped mounts would_dump: handle idmapped mounts ...	2021-02-23 13:39:45 -08:00
Shyam Prasad N	6d82c27ae5	cifs: Identify a connection by a conn_id. Introduced a new field conn_id in TCP_Server_Info structure. This is a non-persistent unique identifier maintained by the client for a connection to a file server. For this, a global counter named tcpSesNextId is maintained. On allocating a new TCP_Server_Info, this counter is incremented and assigned. Changed the dynamic tracepoints related to reconnects and crediting to be more informative (with conn_id printed). Debugging a crediting issue helped me understand the important things to print here. Always call dynamic tracepoints outside the scope of spinlocks. To do this, copy out the credits and in_flight fields of the server struct before dropping the lock. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-02-16 15:48:02 -06:00
Ronnie Sahlberg	af1a3d2ba9	cifs: In the new mount api we get the full devname as source= so we no longer need to handle or parse the UNC= and prefixpath= options that mount.cifs are generating. This also fixes a bug in the mount command option where the devname would be truncated into just //server/share because we were looking at the truncated UNC value and not the full path. I.e. in the mount command output the devive //server/share/path would show up as just //server/share Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-02-11 10:58:08 -06:00
Ronnie Sahlberg	0d4873f9aa	cifs: fix dfs domain referrals The new mount API requires additional changes to how DFS is handled. Additional testing of DFS uncovered problems with domain based DFS referrals (a follow on patch addresses DFS links) which this patch addresses. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-01-28 21:40:43 -06:00
Christian Brauner	549c729771	fs: make helpers idmap mount aware Extend some inode methods with an additional user namespace argument. A filesystem that is aware of idmapped mounts will receive the user namespace the mount has been marked with. This can be used for additional permission checking and also to enable filesystems to translate between uids and gids if they need to. We have implemented all relevant helpers in earlier patches. As requested we simply extend the exisiting inode method instead of introducing new ones. This is a little more code churn but it's mostly mechanical and doesnt't leave us with additional inode methods. Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-01-24 14:27:20 +01:00
Christian Brauner	47291baa8d	namei: make permission helpers idmapped mount aware The two helpers inode_permission() and generic_permission() are used by the vfs to perform basic permission checking by verifying that the caller is privileged over an inode. In order to handle idmapped mounts we extend the two helpers with an additional user namespace argument. On idmapped mounts the two helpers will make sure to map the inode according to the mount's user namespace and then peform identical permission checks to inode_permission() and generic_permission(). If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-6-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-01-24 14:27:16 +01:00
Ronnie Sahlberg	6cf5abbfa8	cifs: fix use after free in cifs_smb3_do_mount() Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-15 16:55:57 -06:00
Steve French	0c2b5f7ce5	cifs: fix rsize/wsize to be negotiated values Also make sure these are displayed in /proc/mounts Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2020-12-15 15:13:59 -06:00
Steve French	653a5efb84	cifs: update super_operations to show_devname This is needed so that we display the correct //server/share vs \\server\share in /proc/mounts for the device name (in the new mount API). Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>	2020-12-15 15:12:51 -06:00
Ronnie Sahlberg	51acd208bd	cifs: remove ctx argument from cifs_setup_cifs_sb Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-14 09:26:30 -06:00
Ronnie Sahlberg	7c7ee628f8	cifs: uncomplicate printing the iocharset parameter There is no need to load the default nls to check if the iocharset argument was specified or not since we have it in cifs_sb->ctx Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-14 09:26:30 -06:00
Ronnie Sahlberg	9ccecae8d1	cifs: we do not allow changing username/password/unc/... during remount Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-14 09:26:30 -06:00
Ronnie Sahlberg	522aa3b575	cifs: move [brw]size from cifs_sb to cifs_sb->ctx Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-14 09:26:30 -06:00
Ronnie Sahlberg	c741cba2cd	cifs: move cifs_cleanup_volume_info[_content] to fs_context.c and rename it to smb3_cleanup_fs_context[_content] Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-14 09:26:30 -06:00
Ronnie Sahlberg	af1e40d9ac	cifs: remove actimeo from cifs_sb Can now be accessed via the ctx Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-14 09:16:23 -06:00
Ronnie Sahlberg	8401e93678	cifs: remove [gu]id/backup[gu]id/file_mode/dir_mode from cifs_sb We can already access these from cifs_sb->ctx so we no longer need a local copy in cifs_sb. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-14 09:16:23 -06:00
Samuel Cabrero	0ac4e2919a	cifs: add witness mount option and data structs Add 'witness' mount option to register for witness notifications. Signed-off-by: Samuel Cabrero <scabrero@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-14 09:16:22 -06:00
Samuel Cabrero	06f08dab3c	cifs: Register generic netlink family Register a new generic netlink family to talk to the witness service userspace daemon. Signed-off-by: Samuel Cabrero <scabrero@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-14 09:16:22 -06:00
Ronnie Sahlberg	a2a52a8a36	cifs: get rid of cifs_sb->mountdata as we now have a full smb3_fs_context as part of the cifs superblock we no longer need a local copy of the mount options and can just reference the copy in the smb3_fs_context. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-13 19:12:07 -06:00
Ronnie Sahlberg	d17abdf756	cifs: add an smb3_fs_context to cifs_sb and populate it during mount in cifs_smb3_do_mount() Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-13 19:12:07 -06:00
Ronnie Sahlberg	24e0a1eff9	cifs: switch to new mount api See Documentation/filesystems/mount_api.rst for details on new mount API Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-13 19:12:07 -06:00
Ronnie Sahlberg	3fa1c6d1b8	cifs: rename smb_vol as smb3_fs_context and move it to fs_context.h Harmonize and change all such variables to 'ctx', where possible. No changes to actual logic. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-13 19:12:07 -06:00
Steve French	29e2792304	smb3.1.1: add new module load parm enable_gcm_256 Add new module load parameter enable_gcm_256. If set, then add AES-256-GCM (strongest encryption type) to the list of encryption types requested. Put it in the list as the second choice (since AES-128-GCM is faster and much more broadly supported by SMB3 servers). To make this stronger encryption type, GCM-256, required (the first and only choice, you would use module parameter "require_gcm_256." Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-10-15 23:58:15 -05:00
Steve French	fbfd0b46af	smb3.1.1: add new module load parm require_gcm_256 Add new module load parameter require_gcm_256. If set, then only request AES-256-GCM (strongest encryption type). Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-10-15 23:58:15 -05:00
Steve French	7866c177a0	smb3: fix typo in mount options displayed in /proc/mounts Missing the final 's' in "max_channels" mount option when displayed in /proc/mounts (or by mount command) CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com>	2020-06-10 12:05:15 -05:00
Steve French	82e9367c43	smb3: Add new parm "nodelete" In order to handle workloads where it is important to make sure that a buggy app did not delete content on the drive, the new mount option "nodelete" allows standard permission checks on the server to work, but prevents on the client any attempts to unlink a file or delete a directory on that mount point. This can be helpful when running a little understood app on a network mount that contains important content that should not be deleted. Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>	2020-06-01 00:10:18 -05:00
Steve French	4e8aea30f7	smb3: enable swap on SMB3 mounts Add experimental support for allowing a swap file to be on an SMB3 mount. There are use cases where swapping over a secure network filesystem is preferable. In some cases there are no local block devices large enough, and network block devices can be hard to setup and secure. And in some cases there are no local block devices at all (e.g. with the recent addition of remote boot over SMB3 mounts). There are various enhancements that can be added later e.g.: - doing a mandatory byte range lock over the swapfile (until the Linux VFS is modified to notify the file system that an open is for a swapfile, when the file can be opened "DENY_ALL" to prevent others from opening it). - pinning more buffers in the underlying transport to minimize memory allocations in the TCP stack under the fs - documenting how to create ACLs (on the server) to secure the swapfile (or adding additional tools to cifs-utils to make it easier) Signed-off-by: Steve French <stfrench@microsoft.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2020-04-10 13:32:32 -05:00
Steve French	c7e9f78f7b	cifs: do d_move in rename See commit `349457ccf2` "Allow file systems to manually d_move() inside of ->rename()" Lessens possibility of race conditions in rename Signed-off-by: Steve French <stfrench@microsoft.com>	2020-03-22 22:49:09 -05:00
Steve French	ec57010acd	cifs: add missing mount option to /proc/mounts We were not displaying the mount option "signloosely" in /proc/mounts for cifs mounts which some users found confusing recently Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>	2020-02-24 14:20:38 -06:00
Petr Pavlu	3f6166aaf1	cifs: fix mount option display for sec=krb5i Fix display for sec=krb5i which was wrongly interleaved by cruid, resulting in string "sec=krb5,cruid=<...>i" instead of "sec=krb5i,cruid=<...>". Fixes: `96281b9e46` ("smb3: for kerberos mounts display the credential uid used") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-02-10 08:32:30 -06:00
Amir Goldstein	0f060936e4	SMB3: Backup intent flag missing from some more ops When "backup intent" is requested on the mount (e.g. backupuid or backupgid mount options), the corresponding flag was missing from some of the operations. Change all operations to use the macro cifs_create_options() to set the backup intent flag if needed. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-02-03 16:12:47 -06:00
Linus Torvalds	0aecba6173	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs d_inode/d_flags memory ordering fixes from Al Viro: "Fallout from tree-wide audit for ->d_inode/->d_flags barriers use. Basically, the problem is that negative pinned dentries require careful treatment - unless ->d_lock is locked or parent is held at least shared, another thread can make them positive right under us. Most of the uses turned out to be safe - the main surprises as far as filesystems are concerned were - race in dget_parent() fastpath, that might end up with the caller observing the returned dentry _negative_, due to insufficient barriers. It is positive in memory, but we could end up seeing the wrong value of ->d_inode in CPU cache. Fixed. - manual checks that result of lookup_one_len_unlocked() is positive (and rejection of negatives). Again, insufficient barriers (we might end up with inconsistent observed values of ->d_inode and ->d_flags). Fixed by switching to a new primitive that does the checks itself and returns ERR_PTR(-ENOENT) instead of a negative dentry. That way we get rid of boilerplate converting negatives into ERR_PTR(-ENOENT) in the callers and have a single place to deal with the barrier-related mess - inside fs/namei.c rather than in every caller out there. The guts of pathname resolution do need to be careful - the race found by Ritesh is real, as well as several similar races. Fortunately, it turns out that we can take care of that with fairly local changes in there. The tree-wide audit had not been fun, and I hate the idea of repeating it. I think the right approach would be to annotate the places where we are _not_ guaranteed ->d_inode/->d_flags stability and have sparse catch regressions. But I'm still not sure what would be the least invasive way of doing that and it's clearly the next cycle fodder" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs/namei.c: fix missing barriers when checking positivity fix dget_parent() fastpath race new helper: lookup_positive_unlocked() fs/namei.c: pull positivity check into follow_managed()	2019-12-06 09:06:58 -08:00
Linus Torvalds	937d6eefc7	Here's the main documentation changes for 5.5: - Various kerneldoc script enhancements. - More RST conversions; those are slowing down as we run out of things to convert, but we're a ways from done still. - Dan's "maintainer profile entry" work landed at last. Now we just need to get maintainers to fill in the profiles... - A reworking of the parallel build setup to work better with a variety of systems (and to not take over huge systems entirely in particular). - The MAINTAINERS file is now converted to RST during the build. Hopefully nobody ever tries to print this thing, or they will need to load a lot of paper. - A script and documentation making it easy for maintainers to add Link: tags at commit time. Also included is the removal of a bunch of spurious CR characters. -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAl3j5B0PHGNvcmJldEBs d24ubmV0AAoJEBdDWhNsDH5YtBcH/jIN2cO8/0YW2rjVT+1G6ytSdFUKx5WJ/lpf 5uBeCvuCeYhtCB6+BgnXvjykJ7jDW11/NJNjWqz/gsvD5l5FJK1rXarI/oz2Klyi kcPtDmBF/ki4wz9qXzEpa0vg8LXdjeys50S1vE75qCzxZoPP7YjuRbPnLrlIJukv JbDVi4p9kxgeHfRB4+BHOe5rFwA3mMmaxKNIX34Y+UUO2KZ0g/yUi1bAaQwQAdt+ PsORmkVQ8Puh3K9xRIr7dYlcWBlBiPqzYdvDgTVxSjrxdK6wjYjSgVk2VjC5MBUN mTSTWgyfsIcD/76/s8tq7ZRl2fw+SkCSkFo79Rb/hJwDTb7Vnng= =LPBr -----END PGP SIGNATURE----- Merge tag 'docs-5.5a' of git://git.lwn.net/linux Pull Documentation updates from Jonathan Corbet: "Here are the main documentation changes for 5.5: - Various kerneldoc script enhancements. - More RST conversions; those are slowing down as we run out of things to convert, but we're a ways from done still. - Dan's "maintainer profile entry" work landed at last. Now we just need to get maintainers to fill in the profiles... - A reworking of the parallel build setup to work better with a variety of systems (and to not take over huge systems entirely in particular). - The MAINTAINERS file is now converted to RST during the build. Hopefully nobody ever tries to print this thing, or they will need to load a lot of paper. - A script and documentation making it easy for maintainers to add Link: tags at commit time. Also included is the removal of a bunch of spurious CR characters" * tag 'docs-5.5a' of git://git.lwn.net/linux: (91 commits) docs: remove a bunch of stray CRs docs: fix up the maintainer profile document libnvdimm, MAINTAINERS: Maintainer Entry Profile Maintainer Handbook: Maintainer Entry Profile MAINTAINERS: Reclaim the P: tag for Maintainer Entry Profile docs, parallelism: Rearrange how jobserver reservations are made docs, parallelism: Do not leak blocking mode to other readers docs, parallelism: Fix failure path and add comment Documentation: Remove bootmem_debug from kernel-parameters.txt Documentation: security: core.rst: fix warnings Documentation/process/howto/kokr: Update for 4.x -> 5.x versioning Documentation/translation: Use Korean for Korean translation title docs/memory-barriers.txt: Remove remaining references to mmiowb() docs/memory-barriers.txt/kokr: Update I/O section to be clearer about CPU vs thread docs/memory-barriers.txt/kokr: Fix style, spacing and grammar in I/O section Documentation/kokr: Kill all references to mmiowb() docs/memory-barriers.txt/kokr: Rewrite "KERNEL I/O BARRIER EFFECTS" section docs: Add initial documentation for devfreq Documentation: Document how to get links with git am docs: Add request_irq() documentation ...	2019-12-02 11:51:02 -08:00
Ronnie Sahlberg	32546a9586	cifs: move cifsFileInfo_put logic into a work-queue This patch moves the final part of the cifsFileInfo_put() logic where we need a write lock on lock_sem to be processed in a separate thread that holds no other locks. This is to prevent deadlocks like the one below: > there are 6 processes looping to while trying to down_write > cinode->lock_sem, 5 of them from _cifsFileInfo_put, and one from > cifs_new_fileinfo > > and there are 5 other processes which are blocked, several of them > waiting on either PG_writeback or PG_locked (which are both set), all > for the same page of the file > > 2 inode_lock() (inode->i_rwsem) for the file > 1 wait_on_page_writeback() for the page > 1 down_read(inode->i_rwsem) for the inode of the directory > 1 inode_lock()(inode->i_rwsem) for the inode of the directory > 1 __lock_page > > > so processes are blocked waiting on: > page flags PG_locked and PG_writeback for one specific page > inode->i_rwsem for the directory > inode->i_rwsem for the file > cifsInodeInflock_sem > > > > here are the more gory details (let me know if I need to provide > anything more/better): > > [0 00:48:22.765] [UN] PID: 8863 TASK: ffff8c691547c5c0 CPU: 3 > COMMAND: "reopen_file" > #0 [ffff9965007e3ba8] __schedule at ffffffff9b6e6095 > #1 [ffff9965007e3c38] schedule at ffffffff9b6e64df > #2 [ffff9965007e3c48] rwsem_down_write_slowpath at ffffffff9af283d7 > #3 [ffff9965007e3cb8] legitimize_path at ffffffff9b0f975d > #4 [ffff9965007e3d08] path_openat at ffffffff9b0fe55d > #5 [ffff9965007e3dd8] do_filp_open at ffffffff9b100a33 > #6 [ffff9965007e3ee0] do_sys_open at ffffffff9b0eb2d6 > #7 [ffff9965007e3f38] do_syscall_64 at ffffffff9ae04315 > * (I think legitimize_path is bogus) > > in path_openat > } else { > const char *s = path_init(nd, flags); > while (!(error = link_path_walk(s, nd)) && > (error = do_last(nd, file, op)) > 0) { <<<< > > do_last: > if (open_flag & O_CREAT) > inode_lock(dir->d_inode); <<<< > else > so it's trying to take inode->i_rwsem for the directory > > DENTRY INODE SUPERBLK TYPE PATH > ffff8c68bb8e79c0 ffff8c691158ef20 ffff8c6915bf9000 DIR /mnt/vm1_smb/ > inode.i_rwsem is ffff8c691158efc0 > > <struct rw_semaphore 0xffff8c691158efc0>: > owner: <struct task_struct 0xffff8c6914275d00> (UN - 8856 - > reopen_file), counter: 0x0000000000000003 > waitlist: 2 > 0xffff9965007e3c90 8863 reopen_file UN 0 1:29:22.926 > RWSEM_WAITING_FOR_WRITE > 0xffff996500393e00 9802 ls UN 0 1:17:26.700 > RWSEM_WAITING_FOR_READ > > > the owner of the inode.i_rwsem of the directory is: > > [0 00:00:00.109] [UN] PID: 8856 TASK: ffff8c6914275d00 CPU: 3 > COMMAND: "reopen_file" > #0 [ffff99650065b828] __schedule at ffffffff9b6e6095 > #1 [ffff99650065b8b8] schedule at ffffffff9b6e64df > #2 [ffff99650065b8c8] schedule_timeout at ffffffff9b6e9f89 > #3 [ffff99650065b940] msleep at ffffffff9af573a9 > #4 [ffff99650065b948] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs] > #5 [ffff99650065ba38] cifs_writepage_locked at ffffffffc0a0b8f3 [cifs] > #6 [ffff99650065bab0] cifs_launder_page at ffffffffc0a0bb72 [cifs] > #7 [ffff99650065bb30] invalidate_inode_pages2_range at ffffffff9b04d4bd > #8 [ffff99650065bcb8] cifs_invalidate_mapping at ffffffffc0a11339 [cifs] > #9 [ffff99650065bcd0] cifs_revalidate_mapping at ffffffffc0a1139a [cifs] > #10 [ffff99650065bcf0] cifs_d_revalidate at ffffffffc0a014f6 [cifs] > #11 [ffff99650065bd08] path_openat at ffffffff9b0fe7f7 > #12 [ffff99650065bdd8] do_filp_open at ffffffff9b100a33 > #13 [ffff99650065bee0] do_sys_open at ffffffff9b0eb2d6 > #14 [ffff99650065bf38] do_syscall_64 at ffffffff9ae04315 > > cifs_launder_page is for page 0xffffd1e2c07d2480 > > crash> page.index,mapping,flags 0xffffd1e2c07d2480 > index = 0x8 > mapping = 0xffff8c68f3cd0db0 > flags = 0xfffffc0008095 > > PAGE-FLAG BIT VALUE > PG_locked 0 0000001 > PG_uptodate 2 0000004 > PG_lru 4 0000010 > PG_waiters 7 0000080 > PG_writeback 15 0008000 > > > inode is ffff8c68f3cd0c40 > inode.i_rwsem is ffff8c68f3cd0ce0 > DENTRY INODE SUPERBLK TYPE PATH > ffff8c68a1f1b480 ffff8c68f3cd0c40 ffff8c6915bf9000 REG > /mnt/vm1_smb/testfile.8853 > > > this process holds the inode->i_rwsem for the parent directory, is > laundering a page attached to the inode of the file it's opening, and in > _cifsFileInfo_put is trying to down_write the cifsInodeInflock_sem > for the file itself. > > > <struct rw_semaphore 0xffff8c68f3cd0ce0>: > owner: <struct task_struct 0xffff8c6914272e80> (UN - 8854 - > reopen_file), counter: 0x0000000000000003 > waitlist: 1 > 0xffff9965005dfd80 8855 reopen_file UN 0 1:29:22.912 > RWSEM_WAITING_FOR_WRITE > > this is the inode.i_rwsem for the file > > the owner: > > [0 00:48:22.739] [UN] PID: 8854 TASK: ffff8c6914272e80 CPU: 2 > COMMAND: "reopen_file" > #0 [ffff99650054fb38] __schedule at ffffffff9b6e6095 > #1 [ffff99650054fbc8] schedule at ffffffff9b6e64df > #2 [ffff99650054fbd8] io_schedule at ffffffff9b6e68e2 > #3 [ffff99650054fbe8] __lock_page at ffffffff9b03c56f > #4 [ffff99650054fc80] pagecache_get_page at ffffffff9b03dcdf > #5 [ffff99650054fcc0] grab_cache_page_write_begin at ffffffff9b03ef4c > #6 [ffff99650054fcd0] cifs_write_begin at ffffffffc0a064ec [cifs] > #7 [ffff99650054fd30] generic_perform_write at ffffffff9b03bba4 > #8 [ffff99650054fda8] __generic_file_write_iter at ffffffff9b04060a > #9 [ffff99650054fdf0] cifs_strict_writev.cold.70 at ffffffffc0a4469b [cifs] > #10 [ffff99650054fe48] new_sync_write at ffffffff9b0ec1dd > #11 [ffff99650054fed0] vfs_write at ffffffff9b0eed35 > #12 [ffff99650054ff00] ksys_write at ffffffff9b0eefd9 > #13 [ffff99650054ff38] do_syscall_64 at ffffffff9ae04315 > > the process holds the inode->i_rwsem for the file to which it's writing, > and is trying to __lock_page for the same page as in the other processes > > > the other tasks: > [0 00:00:00.028] [UN] PID: 8859 TASK: ffff8c6915479740 CPU: 2 > COMMAND: "reopen_file" > #0 [ffff9965007b39d8] __schedule at ffffffff9b6e6095 > #1 [ffff9965007b3a68] schedule at ffffffff9b6e64df > #2 [ffff9965007b3a78] schedule_timeout at ffffffff9b6e9f89 > #3 [ffff9965007b3af0] msleep at ffffffff9af573a9 > #4 [ffff9965007b3af8] cifs_new_fileinfo.cold.61 at ffffffffc0a42a07 [cifs] > #5 [ffff9965007b3b78] cifs_open at ffffffffc0a0709d [cifs] > #6 [ffff9965007b3cd8] do_dentry_open at ffffffff9b0e9b7a > #7 [ffff9965007b3d08] path_openat at ffffffff9b0fe34f > #8 [ffff9965007b3dd8] do_filp_open at ffffffff9b100a33 > #9 [ffff9965007b3ee0] do_sys_open at ffffffff9b0eb2d6 > #10 [ffff9965007b3f38] do_syscall_64 at ffffffff9ae04315 > > this is opening the file, and is trying to down_write cinode->lock_sem > > > [0 00:00:00.041] [UN] PID: 8860 TASK: ffff8c691547ae80 CPU: 2 > COMMAND: "reopen_file" > [0 00:00:00.057] [UN] PID: 8861 TASK: ffff8c6915478000 CPU: 3 > COMMAND: "reopen_file" > [0 00:00:00.059] [UN] PID: 8858 TASK: ffff8c6914271740 CPU: 2 > COMMAND: "reopen_file" > [0 00:00:00.109] [UN] PID: 8862 TASK: ffff8c691547dd00 CPU: 6 > COMMAND: "reopen_file" > #0 [ffff9965007c3c78] __schedule at ffffffff9b6e6095 > #1 [ffff9965007c3d08] schedule at ffffffff9b6e64df > #2 [ffff9965007c3d18] schedule_timeout at ffffffff9b6e9f89 > #3 [ffff9965007c3d90] msleep at ffffffff9af573a9 > #4 [ffff9965007c3d98] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs] > #5 [ffff9965007c3e88] cifs_close at ffffffffc0a07aaf [cifs] > #6 [ffff9965007c3ea0] __fput at ffffffff9b0efa6e > #7 [ffff9965007c3ee8] task_work_run at ffffffff9aef1614 > #8 [ffff9965007c3f20] exit_to_usermode_loop at ffffffff9ae03d6f > #9 [ffff9965007c3f38] do_syscall_64 at ffffffff9ae0444c > > closing the file, and trying to down_write cifsi->lock_sem > > > [0 00:48:22.839] [UN] PID: 8857 TASK: ffff8c6914270000 CPU: 7 > COMMAND: "reopen_file" > #0 [ffff9965006a7cc8] __schedule at ffffffff9b6e6095 > #1 [ffff9965006a7d58] schedule at ffffffff9b6e64df > #2 [ffff9965006a7d68] io_schedule at ffffffff9b6e68e2 > #3 [ffff9965006a7d78] wait_on_page_bit at ffffffff9b03cac6 > #4 [ffff9965006a7e10] __filemap_fdatawait_range at ffffffff9b03b028 > #5 [ffff9965006a7ed8] filemap_write_and_wait at ffffffff9b040165 > #6 [ffff9965006a7ef0] cifs_flush at ffffffffc0a0c2fa [cifs] > #7 [ffff9965006a7f10] filp_close at ffffffff9b0e93f1 > #8 [ffff9965006a7f30] __x64_sys_close at ffffffff9b0e9a0e > #9 [ffff9965006a7f38] do_syscall_64 at ffffffff9ae04315 > > in __filemap_fdatawait_range > wait_on_page_writeback(page); > for the same page of the file > > > > [0 00:48:22.718] [UN] PID: 8855 TASK: ffff8c69142745c0 CPU: 7 > COMMAND: "reopen_file" > #0 [ffff9965005dfc98] __schedule at ffffffff9b6e6095 > #1 [ffff9965005dfd28] schedule at ffffffff9b6e64df > #2 [ffff9965005dfd38] rwsem_down_write_slowpath at ffffffff9af283d7 > #3 [ffff9965005dfdf0] cifs_strict_writev at ffffffffc0a0c40a [cifs] > #4 [ffff9965005dfe48] new_sync_write at ffffffff9b0ec1dd > #5 [ffff9965005dfed0] vfs_write at ffffffff9b0eed35 > #6 [ffff9965005dff00] ksys_write at ffffffff9b0eefd9 > #7 [ffff9965005dff38] do_syscall_64 at ffffffff9ae04315 > > inode_lock(inode); > > > and one 'ls' later on, to see whether the rest of the mount is available > (the test file is in the root, so we get blocked up on the directory > ->i_rwsem), so the entire mount is unavailable > > [0 00:36:26.473] [UN] PID: 9802 TASK: ffff8c691436ae80 CPU: 4 > COMMAND: "ls" > #0 [ffff996500393d28] __schedule at ffffffff9b6e6095 > #1 [ffff996500393db8] schedule at ffffffff9b6e64df > #2 [ffff996500393dc8] rwsem_down_read_slowpath at ffffffff9b6e9421 > #3 [ffff996500393e78] down_read_killable at ffffffff9b6e95e2 > #4 [ffff996500393e88] iterate_dir at ffffffff9b103c56 > #5 [ffff996500393ec8] ksys_getdents64 at ffffffff9b104b0c > #6 [ffff996500393f30] __x64_sys_getdents64 at ffffffff9b104bb6 > #7 [ffff996500393f38] do_syscall_64 at ffffffff9ae04315 > > in iterate_dir: > if (shared) > res = down_read_killable(&inode->i_rwsem); <<<< > else > res = down_write_killable(&inode->i_rwsem); > Reported-by: Frank Sorenson <sorenson@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2019-11-25 01:17:12 -06:00
Aurelien Aptel	bcc8880115	cifs: add multichannel mount options and data structs adds: - [no]multichannel to enable/disable multichannel - max_channels=N to control how many channels to create these options are then stored in the volume struct. - store channels and max_channels in cifs_ses Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2019-11-25 01:16:30 -06:00
Ronnie Sahlberg	3591bb83ee	cifs: don't use 'pre:' for MODULE_SOFTDEP It can cause to fail with modprobe: FATAL: Module <module> is builtin. RHBZ: 1767094 Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2019-11-25 01:16:30 -06:00

1 2 3 4 5 ...

531 Commits