linux/net
Pengcheng Yang e176b1ba47 tcp: fix marked lost packets not being retransmitted
When the packet pointed to by retransmit_skb_hint is unlinked by ACK,
retransmit_skb_hint will be set to NULL in tcp_clean_rtx_queue().
If packet loss is detected at this time, retransmit_skb_hint will be set
to point to the current packet loss in tcp_verify_retransmit_hint(),
then the packets that were previously marked lost but not retransmitted
due to the restriction of cwnd will be skipped and cannot be
retransmitted.

To fix this, when retransmit_skb_hint is NULL, retransmit_skb_hint can
be reset only after all marked lost packets are retransmitted
(retrans_out >= lost_out), otherwise we need to traverse from
tcp_rtx_queue_head in tcp_xmit_retransmit_queue().

Packetdrill to demonstrate:

// Disable RACK and set max_reordering to keep things simple
    0 `sysctl -q net.ipv4.tcp_recovery=0`
   +0 `sysctl -q net.ipv4.tcp_max_reordering=3`

// Establish a connection
   +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
   +0 bind(3, ..., ...) = 0
   +0 listen(3, 1) = 0

  +.1 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
   +0 > S. 0:0(0) ack 1 <...>
 +.01 < . 1:1(0) ack 1 win 257
   +0 accept(3, ..., ...) = 4

// Send 8 data segments
   +0 write(4, ..., 8000) = 8000
   +0 > P. 1:8001(8000) ack 1

// Enter recovery and 1:3001 is marked lost
 +.01 < . 1:1(0) ack 1 win 257 <sack 3001:4001,nop,nop>
   +0 < . 1:1(0) ack 1 win 257 <sack 5001:6001 3001:4001,nop,nop>
   +0 < . 1:1(0) ack 1 win 257 <sack 5001:7001 3001:4001,nop,nop>

// Retransmit 1:1001, now retransmit_skb_hint points to 1001:2001
   +0 > . 1:1001(1000) ack 1

// 1001:2001 was ACKed causing retransmit_skb_hint to be set to NULL
 +.01 < . 1:1(0) ack 2001 win 257 <sack 5001:8001 3001:4001,nop,nop>
// Now retransmit_skb_hint points to 4001:5001 which is now marked lost

// BUG: 2001:3001 was not retransmitted
   +0 > . 2001:3001(1000) ack 1

Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-15 22:34:31 +01:00
..
6lowpan 6lowpan: no need to check return value of debugfs_create functions 2019-07-06 12:50:01 +02:00
9p 9p pull request for inclusion in 5.4 2019-09-27 15:10:34 -07:00
802 treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
8021q vlan: vlan_changelink() should propagate errors 2020-01-07 13:35:14 -08:00
appletalk appletalk: enforce CAP_NET_RAW for raw sockets 2019-09-24 16:37:18 +02:00
atm Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-11-22 16:27:24 -08:00
ax25 net: use helpers to change sk_ack_backlog 2019-11-06 16:14:48 -08:00
batman-adv treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
bluetooth compat_ioctl: remove most of fs/compat_ioctl.c 2019-12-01 13:46:15 -08:00
bpf treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
bpfilter Kbuild updates for v5.3 2019-07-12 16:03:16 -07:00
bridge Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf 2019-12-26 13:11:40 -08:00
caif Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-11-02 13:54:56 -07:00
can can: j1939: j1939_sk_bind(): take priv after lock is held 2019-12-08 11:52:02 +01:00
ceph libceph, rbd, ceph: convert to use the new mount API 2019-11-27 22:28:37 +01:00
core devlink: correct misspelling of snapshot 2020-01-11 14:30:24 -08:00
dcb
dccp treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
decnet net: add bool confirm_neigh parameter for dst_ops.update_pmtu 2019-12-24 22:28:54 -08:00
dns_resolver Revert "Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs" 2019-07-10 18:43:43 -07:00
dsa net: dsa: ksz: use common define for tag len 2019-12-20 21:06:49 -08:00
ethernet net: add annotations on hh->hh_len lockless accesses 2019-11-07 20:07:30 -08:00
hsr hsr: fix slab-out-of-bounds Read in hsr_debugfs_rename() 2019-12-30 20:36:27 -08:00
ieee802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-11-02 13:54:56 -07:00
ife net: Fix Kconfig indentation 2019-09-26 08:56:17 +02:00
ipv4 tcp: fix marked lost packets not being retransmitted 2020-01-15 22:34:31 +01:00
ipv6 sit: do not confirm neighbor when do pmtu update 2019-12-24 22:28:55 -08:00
iucv treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
kcm kcm: disable preemption in kcm_parse_func_strparser() 2019-09-27 10:27:14 +02:00
key Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-07-08 19:48:57 -07:00
l2tp net: ipv6: add net argument to ip6_dst_lookup_flow 2019-12-04 12:27:12 -08:00
l3mdev ipv6: convert major tx path to use RT6_LOOKUP_F_DST_NOREF 2019-06-23 13:24:17 -07:00
lapb
llc llc2: Fix return statement of llc_stat_ev_rx_null_dsap_xid_c (and _test_c) 2019-12-20 21:19:36 -08:00
mac80211 mac80211: Fix TKIP replay protection immediately after key setup 2020-01-15 09:52:12 +01:00
mac802154
mpls net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup 2019-12-04 12:27:13 -08:00
ncsi net/ncsi: Disable global multicast filter 2019-09-19 18:04:40 -07:00
netfilter netfilter: ipset: avoid null deref when IPSET_ATTR_LINENO is present 2020-01-08 23:31:46 +01:00
netlabel netlabel: remove redundant assignment to pointer iter 2019-09-01 11:45:02 -07:00
netlink treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
netrom net: core: add generic lockdep keys 2019-10-24 14:53:48 -07:00
nfc net: nfc: nci: fix a possible sleep-in-atomic-context bug in nci_uart_tty_receive() 2019-12-18 11:57:33 -08:00
nsh
openvswitch treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
packet af_packet: set defaule value for tmo 2019-12-09 14:30:19 -08:00
phonet net: use skb_queue_empty_lockless() in poll() handlers 2019-10-28 13:33:41 -07:00
psample net: psample: fix skb_over_panic 2019-11-26 14:40:13 -08:00
qrtr net: qrtr: fix len of skb_put_padto in qrtr_node_enqueue 2020-01-05 14:46:05 -08:00
rds Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-11-16 21:51:42 -08:00
rfkill rfkill: Fix incorrect check to avoid NULL pointer dereference 2019-12-16 10:15:49 +01:00
rose net: use helpers to change sk_ack_backlog 2019-11-06 16:14:48 -08:00
rxrpc RxRPC fixes 2019-12-24 16:12:47 -08:00
sched net: sch_prio: When ungrafting, replace with FIFO 2020-01-08 12:45:53 -08:00
sctp sctp: free cmd->obj.chunk for the unprocessed SCTP_CMD_REPLY 2020-01-06 13:28:37 -08:00
smc net/smc: unregister ib devices in reboot_event 2019-12-20 21:31:19 -08:00
strparser
sunrpc This is a relatively quiet cycle for nfsd, mainly various bugfixes. 2019-12-07 16:56:00 -08:00
switchdev
tipc tipc: fix wrong connect() return code 2020-01-08 15:57:35 -08:00
tls net/tls: fix async operation 2020-01-10 11:20:46 -08:00
unix treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
vmw_vsock hv_sock: Remove the accept port restriction 2020-01-14 11:50:53 -08:00
wimax wimax: no need to check return value of debugfs_create functions 2019-08-10 15:25:47 -07:00
wireless cfg80211: fix page refcount issue in A-MSDU decap 2020-01-15 09:53:35 +01:00
x25 net/x25: fix nonblocking connect 2020-01-09 18:39:33 -08:00
xdp xsk: Add rcu_read_lock around the XSK wakeup 2019-12-19 16:20:48 +01:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2019-11-25 20:02:57 -08:00
Kconfig net: Fix Kconfig indentation, continued 2019-11-21 12:00:21 -08:00
Makefile
compat.c y2038: socket: use __kernel_old_timespec instead of timespec 2019-11-15 14:38:29 +01:00
socket.c io_uring-5.5-20191212 2019-12-13 14:24:54 -08:00
sysctl_net.c