net/sunrpc/xprt_sock: fix regression in connection error reporting.

Commit 3d4762639d ("tcp: remove poll() flakes when receiving
RST") in v4.12 changed the order in which ->sk_state_change()
and ->sk_error_report() are called when a socket is shut
down - sk_state_change() is now called first.

This causes xs_tcp_state_change() -> xs_sock_mark_closed() ->
xprt_disconnect_done() to wake all pending tasks with -EAGAIN.
When the ->sk_error_report() callback arrives, it is too late to
pass the error on, and it is lost.

An easy way to demonstrate the problem is to try to start
rpc.nfsd while rpcbind isn't running.
nfsd will attempt a tcp connection to rpcbind.  An ECONNREFUSED
error is returned, but sunrpc code loses the error and keeps
retrying.  If it saw the ECONNREFUSED, it would abort.
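
The refused connection itself is easy to observe from userspace. The
following is a minimal sketch, not part of the patch: the helper name
probe_loopback_tcp() is hypothetical, and it only shows the errno an
ordinary TCP client sees when nothing is listening on a loopback port.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not kernel code): try a blocking TCP connect to
 * 127.0.0.1:port.  Returns 0 on success, or the positive errno reported
 * by socket() or connect().
 */
static int probe_loopback_tcp(unsigned short port)
{
	struct sockaddr_in addr;
	int fd, err = 0;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return errno;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(port);
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	/* Save errno before close() has a chance to overwrite it. */
	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		err = errno;

	close(fd);
	return err;
}
```

Calling probe_loopback_tcp(111) on a machine where rpcbind is stopped
would be expected to return ECONNREFUSED, which is the error the sunrpc
code needs to see in order to abort.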

To fix this, handle the sk->sk_err in the TCP_CLOSE branch of
xs_tcp_state_change().
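
The ordering problem and the effect of the fix can be illustrated with a
small userspace model. This is only a sketch under simplifying
assumptions: every model_* name is hypothetical, a task is assumed to
keep the status of its first wake-up, and none of this is the kernel
implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not kernel structures. */
struct model_task {
	int woken;	/* non-zero once the task has been woken */
	int status;	/* status the task was woken with */
};

struct model_xprt {
	struct model_task *pending;	/* tasks waiting on the transport */
	size_t npending;
	int sk_err;			/* models sk->sk_err (positive errno) */
};

/* Wake every task still pending; only the first wake-up sets status. */
static void model_wake_pending(struct model_xprt *x, int status)
{
	assert(x != NULL && x->pending != NULL);
	for (size_t i = 0; i < x->npending; i++) {
		if (!x->pending[i].woken) {
			x->pending[i].woken = 1;
			x->pending[i].status = status;
		}
	}
}

/* Models the TCP_CLOSE branch of the state-change callback. */
static void model_state_change_close(struct model_xprt *x, int with_fix)
{
	if (with_fix && x->sk_err)
		model_wake_pending(x, -x->sk_err);
	/* Models xs_sock_mark_closed() waking tasks with -EAGAIN. */
	model_wake_pending(x, -EAGAIN);
}

/* Models the error-report callback, which now runs second. */
static void model_error_report(struct model_xprt *x)
{
	if (x->sk_err)
		model_wake_pending(x, -x->sk_err);
}

/* Run the v4.12 callback order and return the status the task saw. */
static int model_run(int with_fix)
{
	struct model_task t = { 0, 0 };
	struct model_xprt x = { &t, 1, ECONNREFUSED };

	model_state_change_close(&x, with_fix);
	model_error_report(&x);
	return t.status;
}
```

In this model, model_run(0) returns -EAGAIN because the error report
finds no task left to wake, while model_run(1) returns -ECONNREFUSED
because the socket error is delivered before the tasks are woken with
-EAGAIN.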

Fixes: 3d4762639d ("tcp: remove poll() flakes when receiving RST")
Cc: stable@vger.kernel.org (v4.12)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown 2017-07-19 14:05:01 +10:00 committed by Anna Schumaker
parent ecc7b435d2
commit 3ffbc1d655
1 changed file with 2 additions and 0 deletions

@@ -1624,6 +1624,8 @@ static void xs_tcp_state_change(struct sock *sk)
 		if (test_and_clear_bit(XPRT_SOCK_CONNECTING,
 					&transport->sock_state))
 			xprt_clear_connecting(xprt);
+		if (sk->sk_err)
+			xprt_wake_pending_tasks(xprt, -sk->sk_err);
 		xs_sock_mark_closed(xprt);
 	}
  out: