add documents for snmp counters

Add explanation of below counters:
TcpExtTCPRcvCoalesce
TcpExtTCPAutoCorking
TcpExtTCPOrigDataSent
TCPSynRetrans
TCPFastOpenActiveFail
TcpExtListenOverflows
TcpExtListenDrops
TcpExtTCPHystartTrainDetect
TcpExtTCPHystartTrainCwnd
TcpExtTCPHystartDelayDetect
TcpExtTCPHystartDelayCwnd

Signed-off-by: yupeng <yupeng0921@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
parent 50853808ff
commit 712ee16c23
@ -220,6 +220,68 @@ Defined in `RFC1213 tcpPassiveOpens`_
It means the TCP layer receives a SYN, replies with a SYN+ACK, and
enters the SYN-RCVD state.

* TcpExtTCPRcvCoalesce
When packets are received by the TCP layer but have not yet been read
by the application, the TCP layer will try to merge them. This counter
indicates how many packets are merged in this situation. If GRO is
enabled, many packets will already have been merged by GRO; those
packets are not counted in TcpExtTCPRcvCoalesce.

* TcpExtTCPAutoCorking
When sending packets, the TCP layer will try to merge small packets
into a bigger one. This counter increases by 1 for every packet merged
in this situation. Please refer to the LWN article for more details:
https://lwn.net/Articles/576263/

* TcpExtTCPOrigDataSent
This counter is explained by `kernel commit f19c29e3e391`_. The
explanation from that commit is quoted below::

  TCPOrigDataSent: number of outgoing packets with original data (excluding
  retransmission but including data-in-SYN). This counter is different from
  TcpOutSegs because TcpOutSegs also tracks pure ACKs. TCPOrigDataSent is
  more useful to track the TCP retransmission rate.

* TcpExtTCPSynRetrans
This counter is explained by `kernel commit f19c29e3e391`_. The
explanation from that commit is quoted below::

  TCPSynRetrans: number of SYN and SYN/ACK retransmits to break down
  retransmissions into SYN, fast-retransmits, timeout retransmits, etc.

* TcpExtTCPFastOpenActiveFail
This counter is explained by `kernel commit f19c29e3e391`_. The
explanation from that commit is quoted below::

  TCPFastOpenActiveFail: Fast Open attempts (SYN/data) failed because
  the remote does not accept it or the attempts timed out.

.. _kernel commit f19c29e3e391: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f19c29e3e391a66a273e9afebaf01917245148cd

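The quoted text for TcpExtTCPOrigDataSent says the counter is useful for
tracking the TCP retransmission rate. The following is a minimal sketch of
that idea, not part of the kernel documentation. It assumes the rate is
defined as TcpRetransSegs divided by TcpExtTCPOrigDataSent and that the
input is text in the three-column layout printed by nstat. The helper names
parse_nstat and retrans_rate are hypothetical, and the sample values are
made up.

```python
def parse_nstat(text):
    """Parse nstat-style output into a dict of counter name -> integer value."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the "#kernel" header and anything that is not "name value rate".
        if len(fields) < 2 or fields[0].startswith('#'):
            continue
        try:
            counters[fields[0]] = int(fields[1])
        except ValueError:
            continue
    return counters


def retrans_rate(counters):
    """Return retransmitted segments per original data segment, or None."""
    orig = counters.get('TcpExtTCPOrigDataSent', 0)
    if orig == 0:
        return None  # no original data sent, so the rate is undefined
    return counters.get('TcpRetransSegs', 0) / orig


# Made-up sample in the nstat output layout, for illustration only.
sample = """#kernel
TcpRetransSegs                  5                  0.0
TcpExtTCPOrigDataSent           200                0.0
"""
rate = retrans_rate(parse_nstat(sample))
print(rate)
```

On a live system the text could come from running nstat and capturing its
output. Whether TcpRetransSegs is the right numerator depends on which
retransmissions you want to count.
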
* TcpExtListenOverflows and TcpExtListenDrops
When the kernel receives a SYN from a client and the TCP accept queue
is full, the kernel drops the SYN and adds 1 to TcpExtListenOverflows.
At the same time, the kernel also adds 1 to TcpExtListenDrops. Whenever
a TCP socket is in the LISTEN state and the kernel needs to drop a
packet, the kernel always adds 1 to TcpExtListenDrops. So an increase
in TcpExtListenOverflows always causes TcpExtListenDrops to increase
as well, but TcpExtListenDrops can also increase without
TcpExtListenOverflows increasing, e.g. when a memory allocation
fails.

Note: The above explanation is based on kernel 4.10 or later. On an
older kernel, the TCP stack behaves differently when the TCP accept
queue is full: it does not drop the SYN, but completes the 3-way
handshake. Because the accept queue is full, the TCP stack keeps the
socket in the TCP half-open queue. While the socket is in the
half-open queue, the TCP stack sends SYN+ACK on an exponential backoff
timer. Each time the client replies with an ACK, the TCP stack checks
whether the accept queue is still full. If it is not full, the socket
is moved to the accept queue; if it is full, the socket stays in the
half-open queue and gets another chance to move to the accept queue
the next time the client replies with an ACK.


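Since every overflow also counts as a listen drop, the difference between
the two counters gives the listen drops that had some other cause, such as
a failed memory allocation. The sketch below illustrates this under the
assumption that the counters are read from text laid out like
/proc/net/netstat (pairs of lines, names first and values second, each with
a "TcpExt:" style prefix). The function names and the sample values are
hypothetical.

```python
def parse_proc_netstat(text):
    """Parse /proc/net/netstat-style text into {'TcpExtListenDrops': 6, ...}."""
    counters = {}
    lines = text.splitlines()
    # The layout is pairs of lines: a line of names, then a line of values.
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(':', 1)
        _, nums = values.split(':', 1)
        for name, num in zip(names.split(), nums.split()):
            counters[prefix + name] = int(num)
    return counters


def non_overflow_listen_drops(counters):
    """Listen drops that were not caused by a full accept queue."""
    return counters['TcpExtListenDrops'] - counters['TcpExtListenOverflows']


# Made-up sample in the same layout as /proc/net/netstat.
sample = (
    "TcpExt: SyncookiesSent ListenOverflows ListenDrops\n"
    "TcpExt: 0 4 6\n"
)
stats = parse_proc_netstat(sample)
print(non_overflow_listen_drops(stats))
```

On a Linux host the same parser could be fed the contents of
/proc/net/netstat. Note that these are absolute values since boot, whereas
nstat shows increments since its last run.
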
TCP Fast Open
=============
When kernel receives a TCP packet, it has two paths to handle the
@ -331,6 +393,38 @@ TcpExtTCPAbortFailed will be increased.

.. _RFC2525 2.17 section: https://tools.ietf.org/html/rfc2525#page-50

TCP Hybrid Slow Start
=====================
The Hybrid Slow Start algorithm is an enhancement of the traditional
TCP congestion window Slow Start algorithm. It uses two pieces of
information to detect whether the max bandwidth of the TCP path is
being approached: the ACK train length and the increase in packet
delay. For detailed information, please refer to the
`Hybrid Slow Start paper`_. When either the ACK train length or the
packet delay hits a specific threshold, the congestion control
algorithm enters the Congestion Avoidance state. As of v4.20, two
congestion control algorithms use Hybrid Slow Start: cubic (the
default congestion control algorithm) and cdg. Four snmp counters
relate to the Hybrid Slow Start algorithm.

.. _Hybrid Slow Start paper: https://pdfs.semanticscholar.org/25e9/ef3f03315782c7f1cbcd31b587857adae7d1.pdf

* TcpExtTCPHystartTrainDetect
How many times the ACK train length threshold is detected.

* TcpExtTCPHystartTrainCwnd
The sum of the CWND values at the times the ACK train length threshold
was detected. Dividing this value by TcpExtTCPHystartTrainDetect gives
the average CWND at which the ACK train length threshold was detected.

* TcpExtTCPHystartDelayDetect
How many times the packet delay threshold is detected.

* TcpExtTCPHystartDelayCwnd
The sum of the CWND values at the times the packet delay threshold was
detected. Dividing this value by TcpExtTCPHystartDelayDetect gives the
average CWND at which the packet delay threshold was detected.

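The division described for the two Cwnd counters can be sketched as
follows. This is only an illustration: the helper name average_cwnd is
hypothetical, the counter values are made up, and the guard for a zero
detection count is an added assumption so that the division is always
defined.

```python
def average_cwnd(cwnd_sum, detect_count):
    """Average CWND at detection time, or None if nothing was detected."""
    if detect_count == 0:
        return None
    return cwnd_sum / detect_count


# Made-up counter values, for illustration only.
counters = {
    'TcpExtTCPHystartTrainDetect': 4,
    'TcpExtTCPHystartTrainCwnd': 120,
    'TcpExtTCPHystartDelayDetect': 0,
    'TcpExtTCPHystartDelayCwnd': 0,
}
train_avg = average_cwnd(counters['TcpExtTCPHystartTrainCwnd'],
                         counters['TcpExtTCPHystartTrainDetect'])
delay_avg = average_cwnd(counters['TcpExtTCPHystartDelayCwnd'],
                         counters['TcpExtTCPHystartDelayDetect'])
print(train_avg, delay_avg)
```

With these sample values the ACK train average is 120 / 4 = 30 and the
delay average is undefined because no delay detection was counted.
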
examples
========

@ -743,3 +837,111 @@ After run client_linger.py, check the output of nstat::

  nstatuser@nstat-a:~$ nstat | grep -i abort
  TcpExtTCPAbortOnLinger          1                  0.0

TcpExtTCPRcvCoalesce
--------------------
On the server, we run a program which listens on TCP port 9000 but
doesn't read any data::

  import socket
  import time
  port = 9000
  s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  s.bind(('0.0.0.0', port))
  s.listen(1)
  sock, addr = s.accept()
  # never call sock.recv(), so received data stays in the receive queue
  while True:
      time.sleep(9999999)

Save the above code as server_coalesce.py, and run::

  python3 server_coalesce.py

On the client, save the code below as client_coalesce.py::

  import socket
  server = 'nstat-b'
  port = 9000
  s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  s.connect((server, port))

Run::

  nstatuser@nstat-a:~$ python3 -i client_coalesce.py

We use '-i' to enter interactive mode, then send a packet::

  >>> s.send(b'foo')
  3

Send a packet again::

  >>> s.send(b'bar')
  3

On the server, run nstat::

  ubuntu@nstat-b:~$ nstat
  #kernel
  IpInReceives                    2                  0.0
  IpInDelivers                    2                  0.0
  IpOutRequests                   2                  0.0
  TcpInSegs                       2                  0.0
  TcpOutSegs                      2                  0.0
  TcpExtTCPRcvCoalesce            1                  0.0
  IpExtInOctets                   110                0.0
  IpExtOutOctets                  104                0.0
  IpExtInNoECTPkts                2                  0.0

The client sent two packets and the server didn't read any data. When
the second packet arrived at the server, the first packet was still in
the receive queue. So the TCP layer merged the two packets, and we
can see that TcpExtTCPRcvCoalesce increased by 1.

TcpExtListenOverflows and TcpExtListenDrops
-------------------------------------------
On the server, run the nc command, listening on port 9000::

  nstatuser@nstat-b:~$ nc -lkv 0.0.0.0 9000
  Listening on [0.0.0.0] (family 0, port 9000)

On the client, run 3 nc commands in different terminals::

  nstatuser@nstat-a:~$ nc -v nstat-b 9000
  Connection to nstat-b 9000 port [tcp/*] succeeded!

The nc command only accepts 1 connection, and its accept queue length
is 1. In the current Linux implementation, setting the queue length to
n means the actual queue length is n+1. Now we have created 3
connections: 1 is accepted by nc and 2 are in the accept queue, so the
accept queue is full.

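The n+1 behaviour can be illustrated with a toy model. This is a sketch
only, not kernel code: it assumes the queue counts as full once its length
exceeds the configured backlog, which is one way to obtain the n+1
capacity described above. The class ToyListener and its method names are
hypothetical.

```python
class ToyListener:
    """Toy model of a listening socket's accept queue (not kernel code)."""

    def __init__(self, backlog):
        self.backlog = backlog
        self.accept_queue = []
        self.listen_overflows = 0
        self.listen_drops = 0

    def is_full(self):
        # Assumed check: full only when the length exceeds the backlog,
        # which is why a backlog of n holds n+1 connections.
        return len(self.accept_queue) > self.backlog

    def on_syn(self, conn):
        """Queue a new connection, or drop it and bump both counters."""
        if self.is_full():
            self.listen_overflows += 1
            self.listen_drops += 1
            return False
        self.accept_queue.append(conn)
        return True

    def accept(self):
        """The application takes one connection off the queue."""
        return self.accept_queue.pop(0)


listener = ToyListener(backlog=1)
listener.on_syn('nc-1')
listener.accept()                 # nc accepts the first connection
listener.on_syn('nc-2')
listener.on_syn('nc-3')           # queue now holds 2 = backlog + 1
fourth = listener.on_syn('nc-4')  # dropped: the queue is full
print(fourth, listener.listen_overflows, listener.listen_drops)
```

The model collapses the handshake into a single step, so it only shows the
counting, not the real queueing of half-open connections.
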
Before running the 4th nc, we clean the nstat history on the server::

  nstatuser@nstat-b:~$ nstat -n

Run the 4th nc on the client::

  nstatuser@nstat-a:~$ nc -v nstat-b 9000

If the nc server is running on kernel 4.10 or later, you won't see the
"Connection to ... succeeded!" string, because the kernel drops the SYN
if the accept queue is full. If the nc server is running on an older
kernel, you would see that the connection succeeded, because the
kernel would complete the 3-way handshake and keep the socket in the
half-open queue. I did the test on kernel 4.15. Below is the nstat
output on the server::

  nstatuser@nstat-b:~$ nstat
  #kernel
  IpInReceives                    4                  0.0
  IpInDelivers                    4                  0.0
  TcpInSegs                       4                  0.0
  TcpExtListenOverflows           4                  0.0
  TcpExtListenDrops               4                  0.0
  IpExtInOctets                   240                0.0
  IpExtInNoECTPkts                4                  0.0

Both TcpExtListenOverflows and TcpExtListenDrops were 4. If the time
between the 4th nc and the nstat were longer, the values of
TcpExtListenOverflows and TcpExtListenDrops would be larger, because
the SYN of the 4th nc was dropped and the client kept retrying.

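How the counters grow with time can be estimated with a small sketch. It
rests on assumptions that this document does not state: the client
retransmits the SYN with an initial timeout of 1 second, the timeout
doubles after each retry, and the client gives up after 6 retries. The
function names are hypothetical.

```python
def syn_send_times(retries, initial_timeout=1.0):
    """Times (seconds) at which the initial SYN and each retry are sent."""
    times = [0.0]
    timeout = initial_timeout
    for _ in range(retries):
        times.append(times[-1] + timeout)
        timeout *= 2  # assumed exponential backoff
    return times


def syns_seen_by(elapsed, retries=6, initial_timeout=1.0):
    """How many SYNs have been sent `elapsed` seconds after the first one."""
    times = syn_send_times(retries, initial_timeout)
    return sum(1 for t in times if t <= elapsed)


print(syn_send_times(3))
print(syns_seen_by(8))
```

Under these assumptions, a counter value of 4 would correspond to running
nstat between 7 and 15 seconds after starting the 4th nc, and waiting
longer would show a larger value, as described above.
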