If control channel MTU is too low to support map operations a warning
will be printed. This is not enough, we want to make sure probe fails
in such scenario, as this would clearly be a faulty configuration.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
In preparation for unifying the skb_frag and bio_vec, use the fine
accessors which already exist and use skb_frag_t instead of
struct skb_frag_struct.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If driver has to drop the TLS frame it needs to undo the TCP
sequence tracking changes, otherwise device will receive
segments out of order and drop them.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the #ifdef CONFIG_TLS_DEVICE a little so we can eliminate
the other one.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Count the number of successfully submitted TLS segments,
not skbs. This will make it easier to compare the TLS
encryption count against other counters.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When TCP stream gets out of sync (driver stops receiving skbs
with expected TCP sequence numbers) request a TX resync from
the kernel.
We try to distinguish retransmissions from missed transmissions
by comparing the sequence number to expected - if it's further
than the expected one - we probably missed packets.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some control messages must be sent from atomic context. The mailbox
takes sleeping locks and uses a waitqueue so add a "posted" version
of communication.
Trylock the semaphore and if that's successful kick of the device
communication. The device communication will be completed from
a workqueue, which will also release the semaphore.
If locks are taken queue the message and return. Schedule a
different workqueue to take the semaphore and run the communication.
Note that the there are currently no atomic users which would actually
need the return value, so all replies to posted messages are just
freed.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Firmware indicates when a packet has been decrypted by reusing the
currently unused BPF flag. Transfer this information into the skb
and provide a statistic of all decrypted segments.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packets received at the NFP driver may be redirected to egress of another
netdev (e.g. in the case of OvS internal ports). On the egress path, some
processes, like TC egress hooks, may expect the network header offset
field in the skb to be correctly set. If this is not the case there is
potential for abnormal behaviour and even the triggering of BUG() calls.
Set the skb network header field before the mac header pull when doing a
packet redirect.
Fixes: 27f54b5825 ("nfp: allow fallback packets from non-reprs")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Count TX TLS packets: successes, out of order, and dropped due to
missing record info. Make sure the RX and TX completion statistics
don't share cache lines with TX ones as much as possible. With TLS
stats they are no longer reasonably aligned.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prepend connection handle to each transmitted TLS packet.
For each connection, the driver tracks the next sequence number
expected. If an out of order packet is observed, the driver calls into
the TLS kernel code to reencrypt that particular skb.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subsequent patches will add support for more TX metadata fields.
Prepare for this by handling an additional double word - firmware
handle as metadata type 7.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add FW ABI defines and code for basic init of TLS offload.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FW may prefer to handle some communication via a mailbox
or the vNIC may simply not have a control queue (VFs).
Add a way of exchanging ccm-compatible messages via a
mailbox.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We will need to release the bar lock from a workqueue
so move from a mutex to a semaphore. This lock should
not be too hot. Unfortunately semaphores don't have
lockdep support.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently if we need to modify the head of the skb and allocation
fails we would free the skb and not increment the error counter.
Make sure all errors are counted.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
By default VFs are not trusted. Add ndo_set_vf_trust support to toggle
a new per-VF bit. Coupled with FW with this capability allows a
trusted VF to change its MAC even after being administratively set by
the PF. Also populate the trusted field on ndo_get_vf_config. Add the
same ndo to the representors.
Signed-off-by: Pablo Cascón <pablo.cascon@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, it is assumed that fallback packets will be from reprs. Modify
this to allow an app to receive non-repr ports from the fallback channel -
e.g. from an internal port. If such a packet is received, do not update
repr stats.
Change the naming function calls so as not to imply it will always be a
repr netdev returned. Add the option to set a bool value to redirect a
fallback packet out the returned port rather than RXing it. Setting of
this bool in subsequent patches allows the handling of packets falling
back when they are due to egress an internal port.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During probe we clear vNIC configuration in case the device
wasn't closed cleanly by previous driver. Move that code
before netdev init, so netdev init can already try to apply
its config parameters.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Soon we will try to write to the vNIC mailbox without RTNL held.
Add a new mutex to protect access to specific parts of the PCI
control BAR.
Move the mailbox size checking to the mailbox lock() helper, where
it can be more effective (happen prior to potential overwrite of
other data).
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the reconfig was a quick update, we could have results available from
firmware within 200us.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove implementation of get_port_parent_id ndo and rely on core calling
into devlink for the information directly.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two reasons for this.
First, the xmit_more flag conceptually doesn't fit into the skb, as
xmit_more is not a property related to the skb.
Its only a hint to the driver that the stack is about to transmit another
packet immediately.
Second, it was only done this way to not have to pass another argument
to ndo_start_xmit().
We can place xmit_more in the softnet data, next to the device recursion.
The recursion counter is already written to on each transmit. The "more"
indicator is placed right next to it.
Drivers can use the netdev_xmit_more() helper instead of skb->xmit_more
to check the "more packets coming" hint.
skb->xmit_more is retained (but always 0) to not cause build breakage.
This change takes care of the simple s/skb->xmit_more/netdev_xmit_more()/
conversions. Remaining drivers are converted in the next patches.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
If nn->port is defined it means that devlink_port has been registered
for this port as well. Devlink core is handling the port name
formatting.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Follow-up patch is going to need a devlink port instance according to
a netdev. Devlink port instance should be always available when devlink
is used. So change the recently introduced ndo_get_devlink to
ndo_get_devlink_port. With that, adjust the wrapper for the only
user to get devlink pointer.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NFP driver ABI contains bits for L2 switching which were never
implemented in initially envisioned form.
Remove the defines, and open up the possibility of
reclaiming the bits for other uses.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support getting devlink instance from a new NDO.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NFP only supports SWITCHDEV_ATTR_ID_PORT_PARENT_ID, which makes it a
great candidate to be converted to use the ndo_get_port_parent_id() NDO
instead of implementing switchdev_port_attr_get().
Since NFP uses switchdev_port_same_parent_id() convert it to use
netdev_port_same_parent_id().
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We already need to zero out memory for dma_alloc_coherent(), as such
using dma_zalloc_coherent() is superflous. Phase it out.
This change was generated with the following Coccinelle SmPL patch:
@ replace_dma_zalloc_coherent @
expression dev, size, data, handle, flags;
@@
-dma_zalloc_coherent(dev, size, handle, flags)
+dma_alloc_coherent(dev, size, handle, flags)
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
[hch: re-ran the script on the latest tree]
Signed-off-by: Christoph Hellwig <hch@lst.de>
FW reconfiguration timeouts are a common indicator of FW trouble.
To make debugging easier print requested update and control word
when reconfiguration fails.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for TSO over representors make sure the port id
prepend will always fit in the frame. The current max header
length is 255, which is ample, so assume worst case scenario
of 8 byte prepend and save ourselves the conditionals.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TSO-related offsets in the descriptor should not include
the length of the prepended metadata. Adjust them. Note that
this could not have caused issues in the past as we don't
support TSO with metadata prepend as of this patch.
Signed-off-by: Michael Rapson <michael.rapson@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
nd_q is only used at the very end of nfp_net_tx(), there is no need
to initialize it early.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move temporary variables in scope of the loop in nfp_net_tx_complete(),
and add a temp for txbuf software structure. This saves us 0.2% of CPU.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chained descriptors for fragments need to duplicate all the descriptor
fields of the skb head, so we copy the descriptor and then modify the
relevant fields. This is wasteful, because the top half of the descriptor
will get overwritten entirely while the bottom half is not modified at all.
Copy only the bottom half. This saves us 0.3% of CPU in a GSO test.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Learn how to set the DSCP map. FW uses a packed array which
geometry depends on the number of supported priorities and
virtual queues. Write code to assemble this map and to communicate
the setting to the FW via mailbox.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
__netdev_tx_sent_queue() was added in commit e59020abf0f
("net: bql: add __netdev_tx_sent_queue()") and allows for
better GSO performance.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In certain cases initialization logic which follows allocation of
the vNIC structure may want to validate the capabilities of that vNIC.
This is easy before vNIC is initialized for normal capabilities which
are at fixed offsets in control memory, easy to locate and read, but
poses a challenge if the capabilities are in form of TLVs. Parse
the TLVs early on so other code can just access parsed info, instead
of having to do the parsing by itself.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move setting ctrl_bar pointer to the nfp_net_alloc function,
to make sure we can parse capabilities early in the following
patch.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the repeated license text with SDPX identifiers.
While at it bump the Copyright dates for files we touched
this year.
Signed-off-by: Edwin Peer <edwin.peer@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Nic Viljoen <nick.viljoen@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
pull-request: bpf-next 2018-10-08
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) sk_lookup_[tcp|udp] and sk_release helpers from Joe Stringer which allow
BPF programs to perform lookups for sockets in a network namespace. This would
allow programs to determine early on in processing whether the stack is
expecting to receive the packet, and perform some action (eg drop,
forward somewhere) based on this information.
2) per-cpu cgroup local storage from Roman Gushchin.
Per-cpu cgroup local storage is very similar to simple cgroup storage
except all the data is per-cpu. The main goal of per-cpu variant is to
implement super fast counters (e.g. packet counters), which don't require
neither lookups, neither atomic operations in a fast path.
The example of these hybrid counters is in selftests/bpf/netcnt_prog.c
3) allow HW offload of programs with BPF-to-BPF function calls from Quentin Monnet
4) support more than 64-byte key/value in HW offloaded BPF maps from Jakub Kicinski
5) rename of libbpf interfaces from Andrey Ignatov.
libbpf is maturing as a library and should follow good practices in
library design and implementation to play well with other libraries.
This patch set brings consistent naming convention to global symbols.
6) relicense libbpf as LGPL-2.1 OR BSD-2-Clause from Alexei Starovoitov
to let Apache2 projects use libbpf
7) various AF_XDP fixes from Björn and Magnus
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net'
overlapped the renaming of a netlink attribute in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
When FW floods the driver with control messages try to exit the cmsg
processing loop every now and then to avoid soft lockups. Cmsg
processing is generally very lightweight so 512 seems like a reasonable
budget, which should not be exceeded under normal conditions.
Fixes: 77ece8d5f1 ("nfp: add control vNIC datapath")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some apps may want to have higher MTU on the control vNIC/queue.
Allow them to set the requested MTU at init time.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Version bump conflict in batman-adv, take what's in net-next.
iavf conflict, adjustment of netdev_ops in net-next conflicting
with poll controller method removal in net.
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
nfp uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NFP supports fairly enormous ring sizes (up to 256k descriptors).
In commit 4662717038 ("nfp: use kvcalloc() to allocate SW buffer
descriptor arrays") we have started using kvcalloc() functions to
make sure the allocation of software state arrays doesn't hit
the MAX_ORDER limit. Unfortunately, we can't use virtual mappings
for the DMA region holding HW descriptors. In case this allocation
fails instead of the generic (and fairly scary) warning/splat in
the logs print a helpful message explaining what happened and
suggesting how to fix it.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Report in standard netdev stats drops and errors as well as
RX multicast from the FW vNIC counters.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As you are already in a tasklet, it is unnecessary to call spin_lock_bh.
Signed-off-by: jun qian <hangdianqj@163.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>