From bad115cfe5b509043b684d3a007ab54b80090aa1 Mon Sep 17 00:00:00 2001 From: Willy Tarreau Date: Thu, 17 May 2012 11:14:14 +0000 Subject: [PATCH 1/2] tcp: do_tcp_sendpages() must try to push data out on oom conditions Since recent changes on TCP splicing (starting with commits 2f533844 "tcp: allow splice() to build full TSO packets" and 35f9c09f "tcp: tcp_sendpages() should call tcp_push() once"), I started seeing massive stalls when forwarding traffic between two sockets using splice() when pipe buffers were larger than socket buffers. Latest changes (net: netdev_alloc_skb() use build_skb()) made the problem even more apparent. The reason seems to be that if do_tcp_sendpages() fails on out of memory condition without being able to send at least one byte, tcp_push() is not called and the buffers cannot be flushed. After applying the attached patch, I cannot reproduce the stalls at all and the data rate it perfectly stable and steady under any condition which previously caused the problem to be permanent. The issue seems to have been there since before the kernel migrated to git, which makes me think that the stalls I occasionally experienced with tux during stress-tests years ago were probably related to the same issue. This issue was first encountered on 3.0.31 and 3.2.17, so please backport to -stable. Signed-off-by: Willy Tarreau Acked-by: Eric Dumazet Cc: --- net/ipv4/tcp.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 1272a88c2a63..6589e11d57b6 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -851,8 +851,7 @@ static ssize_t do_tcp_sendpages(struct sock *sk, struct page **pages, int poffse wait_for_sndbuf: set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); wait_for_memory: - if (copied) - tcp_push(sk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH); + tcp_push(sk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH); if ((err = sk_stream_wait_memory(sk, &timeo)) != 0) goto do_error; From 8ce6909f77ba1b7bcdea65cc2388fd1742b6d669 Mon Sep 17 00:00:00 2001 From: Tushar Dave Date: Thu, 17 May 2012 01:04:50 +0000 Subject: [PATCH 2/2] e1000: Prevent reset task killing itself. Killing reset task while adapter is resetting causes deadlock. Only kill reset task if adapter is not resetting. Ref bug #43132 on bugzilla.kernel.org CC: stable@vger.kernel.org Signed-off-by: Tushar Dave Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher Signed-off-by: David S. Miller --- drivers/net/ethernet/intel/e1000/e1000_main.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c index 37caa8885c2a..8d8908d2a9b1 100644 --- a/drivers/net/ethernet/intel/e1000/e1000_main.c +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c @@ -493,7 +493,11 @@ static void e1000_power_down_phy(struct e1000_adapter *adapter) static void e1000_down_and_stop(struct e1000_adapter *adapter) { set_bit(__E1000_DOWN, &adapter->flags); - cancel_work_sync(&adapter->reset_task); + + /* Only kill reset task if adapter is not resetting */ + if (!test_bit(__E1000_RESETTING, &adapter->flags)) + cancel_work_sync(&adapter->reset_task); + cancel_delayed_work_sync(&adapter->watchdog_task); cancel_delayed_work_sync(&adapter->phy_info_task); cancel_delayed_work_sync(&adapter->fifo_stall_task);