linux/net
Chuck Lever 949317464b xprtrdma: Limit number of RDMA segments in RPC-over-RDMA headers
Send buffer space is shared between the RPC-over-RDMA header and
an RPC message. A large RPC-over-RDMA header means less space is
available for the associated RPC message, which then has to be
moved via an RDMA Read or Write.

As more segments are added to the chunk lists, the header increases
in size.  Typical modern hardware needs only a few segments to
convey the maximum payload size, but some devices and registration
modes may need a lot of segments to convey data payload. Sometimes
so many are needed that the remaining space in the Send buffer is
not enough for the RPC message. Sending such a message usually
fails.

To ensure a transport can always make forward progress, cap the
number of RDMA segments that are allowed in chunk lists. This
prevents less-capable devices and memory registrations from
consuming a large portion of the Send buffer by reducing the
maximum data payload that can be conveyed with such devices.

For now I choose an arbitrary maximum of 8 RDMA segments. This
allows a maximum size RPC-over-RDMA header to fit nicely in the
current 1024 byte inline threshold with over 700 bytes remaining
for an inline RPC message.

The current maximum data payload of NFS READ or WRITE requests is
one megabyte. To convey that payload on a client with 4KB pages,
each chunk segment would need to handle 32 or more data pages. This
is well within the capabilities of FMR. For physical registration,
the maximum payload size on platforms with 4KB pages is reduced to
32KB.

For FRWR, a device's maximum page list depth would need to be at
least 34 to support the maximum 1MB payload. A device with a smaller
maximum page list depth means the maximum data payload is reduced
when using that device.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:47:58 -04:00
..
6lowpan 6lowpan: iphc: fix SAM/DAM bit comment 2016-03-10 19:51:29 +01:00
9p net/9p: convert to new CQ API 2016-03-10 20:54:09 -05:00
802
8021q vlan: propagate gso_max_segs 2016-03-17 21:05:01 -04:00
appletalk appletalk: fix erroneous return value 2016-02-18 14:59:34 -05:00
atm net: Generalise wq_has_sleeper helper 2015-11-30 14:47:33 -05:00
ax25 ax25: add link layer header validation function 2016-03-09 22:13:01 -05:00
batman-adv batman-adv: Fix reference counting of hardif_neigh_node object for neigh_node 2016-04-29 19:46:11 +08:00
bluetooth Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-03-19 10:05:34 -07:00
bridge bridge: mdb: Marking port-group as offloaded 2016-04-24 14:23:32 -04:00
caif net: caif: fix misleading indentation 2016-03-14 13:09:50 -04:00
can can: avoid using timeval for uapi 2015-10-13 17:42:34 +02:00
ceph libceph: make authorizer destruction independent of ceph_auth_client 2016-04-25 20:54:13 +02:00
core net: Disable segmentation if checksumming is not supported 2016-05-03 16:00:54 -04:00
dcb net/dcb: make dcbnl.c explicitly non-modular 2015-10-09 07:52:27 -07:00
dccp tcp/dccp: remove obsolete WARN_ON() in icmp handlers 2016-03-17 21:06:40 -04:00
decnet decnet: Do not build routes to devices without decnet private data. 2016-04-10 23:01:30 -04:00
dns_resolver net: dns_resolver: convert time_t to time64_t 2015-11-18 16:27:46 -05:00
dsa net: dsa: refine netdev event notifier 2016-03-14 16:05:32 -04:00
ethernet eth: Pull header from first fragment via eth_get_headlen 2016-02-24 13:58:05 -05:00
hsr net/hsr: fix a warning message 2015-11-23 14:56:15 -05:00
ieee802154 ieee802154: 6lowpan: fix return of netdev notifier 2016-02-23 20:29:40 +01:00
ipv4 gre: do not pull header in ICMP error processing 2016-05-02 00:19:58 -04:00
ipv6 ipv6/ila: fix nlsize calculation for lwtunnel 2016-05-03 16:21:33 -04:00
ipx
irda Merge 4.5-rc4 into tty-next 2016-02-14 14:36:04 -08:00
iucv af_iucv: Validate socket address length in iucv_sock_bind() 2016-01-19 14:21:08 -05:00
kcm kcm: Add receive message timeout 2016-03-09 16:36:15 -05:00
key af_key: fix two typos 2015-10-23 03:05:19 -07:00
l2tp net: l2tp: fix reversed udp6 checksum flags 2016-05-01 19:32:16 -04:00
l3mdev net: l3mdev: address selection should only consider devices in L3 domain 2016-02-26 14:22:26 -05:00
lapb
llc af_llc: fix types on llc_ui_wait_for_conn 2016-02-17 16:12:13 -05:00
mac80211 mac80211: fix statistics leak if dev_alloc_name() fails 2016-04-27 10:06:58 +02:00
mac802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-03-19 10:05:34 -07:00
mpls mpls: find_outdev: check for err ptr in addition to NULL check 2016-04-08 12:43:20 -04:00
netfilter netfilter: nf_conntrack_tcp: Fix stack out of bounds when parsing TCP options 2016-04-07 18:42:37 +02:00
netlabel netlabel: do not initialise statics to NULL 2016-03-07 11:08:26 -05:00
netlink netlink: don't send NETLINK_URELEASE for unbound sockets 2016-04-10 23:32:23 -04:00
netrom
nfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-03-19 10:05:34 -07:00
openvswitch openvswitch: use flow protocol when recalculating ipv6 checksums 2016-04-21 15:28:47 -04:00
packet packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface 2016-04-14 00:46:39 -04:00
phonet sock: struct proto hash function may error 2016-02-11 03:54:14 -05:00
rds RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock. 2016-05-03 16:03:44 -04:00
rfkill Here's another round of updates for -next: 2016-03-01 17:03:27 -05:00
rose
rxrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-03-19 10:05:34 -07:00
sched netem: Segment GSO packets on enqueue 2016-05-03 00:33:14 -04:00
sctp sctp: avoid refreshing heartbeat timer too often 2016-04-10 22:22:34 -04:00
sunrpc xprtrdma: Limit number of RDMA segments in RPC-over-RDMA headers 2016-05-17 15:47:58 -04:00
switchdev switchdev: Adding complete operation to deferred switchdev ops 2016-04-24 14:23:32 -04:00
tipc tipc: only process unicast on intended node 2016-05-01 21:03:30 -04:00
unix Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-02-23 00:09:14 -05:00
vmw_vsock VSOCK: Only check error on skb_recv_datagram when skb is NULL 2016-04-19 20:42:01 -04:00
wimax net:wimax: Fix doucble word "the the" in networking.xml 2015-08-09 22:43:52 -07:00
wireless nl80211: check netlink protocol in socket release notification 2016-04-12 15:39:06 +02:00
x25
xfrm xfrm: Fix crash observed during device unregistration and decryption 2016-03-24 14:29:36 -04:00
Kconfig Make DST_CACHE a silent config option 2016-03-21 22:56:38 -04:00
Makefile kcm: Kernel Connection Multiplexor module 2016-03-09 16:36:14 -05:00
compat.c
socket.c net: Fix use after free in the recvmmsg exit path 2016-03-14 12:41:49 -04:00
sysctl_net.c net: sysctl: fix a kmemleak warning 2015-10-23 06:22:08 -07:00