cifs: ignore cached share root handle closing errors
Fix tcon use-after-free and NULL ptr deref.
Customer system crashes with the following kernel log:
[462233.169868] CIFS VFS: Cancelling wait for mid 4894753 cmd: 14 => a QUERY DIR
[462233.228045] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
[462233.305922] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
[462233.306205] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
[462233.347060] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
[462233.347107] CIFS VFS: Close unmatched open
[462233.347113] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
...
[exception RIP: cifs_put_tcon+0xa0] (this is doing tcon->ses->server)
#6 [...] smb2_cancelled_close_fid at ... [cifs]
#7 [...] process_one_work at ...
#8 [...] worker_thread at ...
#9 [...] kthread at ...
The most likely explanation we have is:
* When we put the last reference of a tcon (refcount=0), we close the
cached share root handle.
* If closing a handle is interrupted, SMB2_close() will
queue a SMB2_close() in a work thread.
* The queued object keeps a tcon ref so we bump the tcon
refcount, jumping from 0 to 1.
* We reach the end of cifs_put_tcon(), we free the tcon object despite
it now having a refcount of 1.
* The queued work now runs, but the tcon, ses & server was freed in
the meantime resulting in a crash.
THREAD 1
========
cifs_put_tcon => tcon refcount reach 0
SMB2_tdis
close_shroot_lease
close_shroot_lease_locked => if cached root has lease && refcount = 0
smb2_close_cached_fid => if cached root valid
SMB2_close => retry close in a thread if interrupted
smb2_handle_cancelled_close
__smb2_handle_cancelled_close => !! tcon refcount bump 0 => 1 !!
INIT_WORK(&cancelled->work, smb2_cancelled_close_fid);
queue_work(cifsiod_wq, &cancelled->work) => queue work
tconInfoFree(tcon); ==> freed!
cifs_put_smb_ses(ses); ==> freed!
THREAD 2 (workqueue)
========
smb2_cancelled_close_fid
SMB2_close(0, cancelled->tcon, ...); => use-after-free of tcon
cifs_put_tcon(cancelled->tcon); => tcon refcount reach 0 second time
*CRASH*
Fixes: d919131935
("CIFS: Close cached root handle only if it has a lease")
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
This commit is contained in:
parent
645c248d6f
commit
e79b0332ae
|
@ -766,6 +766,20 @@ smb2_handle_cancelled_close(struct cifs_tcon *tcon, __u64 persistent_fid,
|
||||||
|
|
||||||
cifs_dbg(FYI, "%s: tc_count=%d\n", __func__, tcon->tc_count);
|
cifs_dbg(FYI, "%s: tc_count=%d\n", __func__, tcon->tc_count);
|
||||||
spin_lock(&cifs_tcp_ses_lock);
|
spin_lock(&cifs_tcp_ses_lock);
|
||||||
|
if (tcon->tc_count <= 0) {
|
||||||
|
struct TCP_Server_Info *server = NULL;
|
||||||
|
|
||||||
|
WARN_ONCE(tcon->tc_count < 0, "tcon refcount is negative");
|
||||||
|
spin_unlock(&cifs_tcp_ses_lock);
|
||||||
|
|
||||||
|
if (tcon->ses)
|
||||||
|
server = tcon->ses->server;
|
||||||
|
|
||||||
|
cifs_server_dbg(FYI, "tid=%u: tcon is closing, skipping async close retry of fid %llu %llu\n",
|
||||||
|
tcon->tid, persistent_fid, volatile_fid);
|
||||||
|
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
tcon->tc_count++;
|
tcon->tc_count++;
|
||||||
spin_unlock(&cifs_tcp_ses_lock);
|
spin_unlock(&cifs_tcp_ses_lock);
|
||||||
|
|
||||||
|
|
Loading…
Reference in New Issue