Commit Graph

18153 Commits

Author SHA1 Message Date
Kees Cook df656bf6fb qlge: avoid format string exposure in workqueue
While unlikely, this makes sure the workqueue name won't be processed
as a format string.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 13:37:21 -07:00
Mintz, Yuval 2f78227874 qed: Correct MSI-x for storage
When qedr is enabled, qed would try dividing the msi-x vectors between
L2 and RoCE, starting with L2 and providing it with sufficient vectors
for its queues.

Problem is qed would also do that for storage partitions, and as those
don't need queues it would lead qed to award those partitions with 0
msi-x vectors, causing them to believe theye're using INTa and
preventing them from operating.

Fixes: 51ff17251c ("qed: Add support for RoCE hw init")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 13:35:27 -07:00
Eric Dumazet 75d04aa368 mlx4: trust shinfo->gso_segs
mlx4 is the only driver in the tree making a point to recompute
shinfo->gso_segs.

Lets remove superfluous code.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 13:27:49 -07:00
Colin Ian King 827d240a23 qed: fix missing break in OOO_LB_TC case
There seems to be a missing break on the OOO_LB_TC case, pq_id
is being assigned and then re-assigned on the fall through default
case and that seems suspect.

Detected by CoverityScan, CID#1424402 ("Missing break in switch")

Fixes: b5a9ee7cf3 ("qed: Revise QM cofiguration")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 13:20:26 -07:00
Tobias Regnery 053ee0a703 net/mlx5e: fix build error without CONFIG_SYSFS
Commit 9008ae0748 ("net/mlx5e: Minimize mlx5e_{open/close}_locked")
copied the calls to netif_set_real_num_{tx,rx}_queues from
mlx5e_open_locked to mlx5e_activate_priv_channels and wraps them in an
if condition to test for netdev->real_num_{tx,rx}_queues.

But netdev->real_num_rx_queues is conditionally compiled in if CONFIG_SYSFS
is set. Without CONFIG_SYSFS the build fails:

drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_activate_priv_channels':
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2515:12: error: 'struct net_device' has no member named 'real_num_rx_queues'; did you mean 'real_num_tx_queues'?

Fix this by unconditionally call netif_set_real_num{tx,rx}_queues like before
commit 9008ae0748.

Fixes: 9008ae0748 ("net/mlx5e: Minimize mlx5e_{open/close}_locked")
Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:55:36 -07:00
Benjamin Herrenschmidt 10cbd64076 ftgmac100: Rework NAPI & interrupts handling
First, don't look at the interrupt status in the poll loop
to decide what to poll. It's wrong. If we have run out of
budget, we may still have RX packets to unqueue but no more
RX interrupt pending.

So instead move the code looking at the interrupt status
into the interrupt handler where it belongs. That avoids a slow
MMIO read in the NAPI fast path. We keep the abnormal interrupts
enabled while NAPI is scheduled.

While at it, actually do something useful in the "error" cases:

On AHB bus error, trigger the new reset task, that's about all
we can do. On RX packet fifo or descriptor overflows, we need
to restart the MAC after having freed things up. So set a flag
that NAPI will see and use to perform that restart after
harvesting the RX ring.

Finally, we shouldn't complete NAPI if there are still outgoing
packets that will need harvesting. Waiting for more interrupts
is less efficient than letting NAPI run a while longer while
the queue drains.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:38:04 -07:00
Benjamin Herrenschmidt 9b86785d1e ftgmac100: Remove useless tests in interrupt handler
The interrupt is neither enabled nor registered when the interface
isn't running (regardless of whether we use nc-si or not) so the
test isn't useful.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:38:04 -07:00
Benjamin Herrenschmidt 874b55bf62 ftgmac100: Rework MAC reset and init
The HW requires a full MAC reset when changing the speed.

Additionally the Aspeed documentation spells out that the
MAC needs to be reset twice with a 10us interval.

We thus move the speed setting and top level reset code
into a new ftgmac100_reset_and_config_mac() function which
handles both. Move the ring pointers initialization there
too in order to reflect the HW change.

Also reduce the timeout for the MAC reset as it shouldn't
take more than 300 clock cycles according to the doc.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:38:04 -07:00
Benjamin Herrenschmidt 855944ce1c ftgmac100: Add a reset task and use it for link changes
Link speed changes require a full HW reset. This isn't done
properly at the moment. It will involve delays and thus isn't
suitable to do from the link poll callback.

So let's create a reset_task that we can queue up when the
link changes. It will be useful for various cases of error
handling as well.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:38:04 -07:00
Benjamin Herrenschmidt da40d9d4b5 ftgmac100: Move the bulk of inits to a separate function
The link monitoring and error handling code will have to
redo the ring inits and HW setup so move the code out of
ftgmac100_open() into a dedicated function.

This forces a bit of re-ordering of ftgmac100_open() but
nothing dramatic.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:38:04 -07:00
Benjamin Herrenschmidt 81f1eca663 ftgmac100: Request the interrupt only after HW is reset
The interrupt isn't shared, so this will keep it masked
until we have the HW in a known sane state.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:38:04 -07:00
Benjamin Herrenschmidt b8dbecff9b ftgmac100: Move napi_add/del to open/close
Rather than probe/remove

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:38:04 -07:00
Benjamin Herrenschmidt 87d18757ec ftgmac100: Split ring alloc, init and rx buffer alloc
Currently, a single function is used to allocate the rings
themselves, initialize them, populate the rx ring, and
allocate the rx buffers. The same happens on free.

This splits them into separate functions. This will be
useful when properly implementing re-initialization on
link changes and error handling when the rings will be
repopulated but not freed.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:38:04 -07:00
Benjamin Herrenschmidt 5176477735 ftgmac100: Cleanup speed/duplex tracking and fix duplex config
Keep track of both the current speed and duplex settings
instead of only speed and properly apply the duplex setting
to the HW.

This reworks the adjust_link() function to also avoid trying
to reconfigure the HW when there is no link and to display
the link state to the user.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:38:04 -07:00
Benjamin Herrenschmidt 8396e1cb0b ftgmac100: Remove "enabled" flags
It's not used in any meaningful way

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:38:04 -07:00
Benjamin Herrenschmidt 831fb3388c ftgmac100: Reorder struct fields and comment
Reorder the fields in struct ftgmac in slightly more logical
groups. Will make more sense as I add/remove some.

No code change.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:38:04 -07:00
Benjamin Herrenschmidt 3f6af0eede ftgmac100: Remove "banner" comments
The divisions they represent are not particularily meaningful
and things are going to be moving around with upcoming changes
making these comments more a burden than anything else.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:38:04 -07:00
Benjamin Herrenschmidt 60b28a1167 ftgmac100: Use netdev->irq instead of private copy
There's a placeholder already for the irq, use it

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:38:04 -07:00
Felix Manlunas bb54be589c liquidio: fix Octeon core watchdog timeout false alarm
Detection of watchdog timeout of Octeon cores is flawed and susceptible to
false alarms.  Refactor by removing the detection code, and in its place,
leverage existing code that monitors for an indication from the NIC
firmware that an Octeon core crashed; expand the meaning of the indication
to "an Octeon core crashed or its watchdog timer expired".  Detection of
watchdog timeout is now delegated to an exception handler in the NIC
firmware; this is free of false alarms.

Also if there's an Octeon core crash or watchdog timeout:
(1) Disable VF Ethernet links.
(2) Decrement the module refcount by an amount equal to the number of
    active VFs of the NIC whose Octeon core crashed or had a watchdog
    timeout.  The refcount will continue to reflect the active VFs of
    other liquidio NIC(s) (if present) whose Octeon cores are faultless.

Item (2) is needed to avoid the case of not being able to unload the driver
because the module refcount is stuck at some non-zero number.  There is
code that, in normal cases, decrements the refcount upon receiving a
message from the firmware that a VF driver was unloaded.  But in
exceptional cases like an Octeon core crash or watchdog timeout, arrival of
that particular message from the firmware might be unreliable.  That normal
case code is changed to not touch the refcount in the exceptional case to
avoid contention (over the refcount) with the liquidio_watchdog kernel
thread who will carry out item (2).

Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:31:56 -07:00
David S. Miller 6f14f443d3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Mostly simple cases of overlapping changes (adding code nearby,
a function whose name changes, for example).

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 08:24:51 -07:00
Ngai-Mint Kwan 7d4fe0d123 fm10k: do not enqueue mailbox when host not ready
Interfaces will reset whenever the TX mailbox FIFO has become full. This
occurs more frequently whenever the IES API application is not running
to process and clear the messages in the FIFO. Thus, this could lead to
situations where the interface would enter an infinite reset loop. That
is: if the interface is trying to synchronize a huge number of unicast
and multicast entries with the IES API application, the TX mailbox FIFO
will become full and the interface resets. Once the interface exits
reset, it'll try to synchronize the unicast and multicast entries again.
Ergo, this creates an infinite loop. Other actions such as multiple
mulitcast mode or up/down transitions will fill the TX mailbox FIFO and
induce the interface to reset. To correct these situations, check if the
interface's "host_ready" flag is enabled before enqueuing any messages
to the TX mailbox FIFO. This check will be conducted by a function call.
Lastly, this issue mainly affects the PF and, thus, the VF is exempt.

Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-04-05 22:47:31 -07:00
Ngai-Mint Kwan 16b1889f8b fm10k: disable receive queue when configuring ring
Write to RXQCTL register to disable the receive queue when configuring
the RX ring.

Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-04-05 22:47:31 -07:00
Jacob Keller 02957703ca fm10k: update function header comment for fm10k_get_stats64
Re-word the comment to avoid stating that we return a value for this
void function. Additionally, there is no need to mention older kernels,
since this is the upstream kernel.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-04-05 22:47:31 -07:00
Jacob Keller b4fd8ffc11 fm10k: allow service task to reschedule itself
If some code path executes fm10k_service_event_schedule(), it is
guaranteed that we only queue the service task once, since we use
__FM10K_SERVICE_SCHED flag. Unfortunately this has a side effect that if
a service request occurs while we are currently running the watchdog, it
is possible that we will fail to notice the request and ignore it until
the next time the request occurs.

This can cause problems with pf/vf mailbox communication and other
service event tasks. To avoid this, introduce a FM10K_SERVICE_REQUEST
bit. When we successfully schedule (and set the _SCHED bit) the service
task, we will clear this bit. However, if we are unable to currently
schedule the service event, we just set the new SERVICE_REQUEST bit.

Finally, after the service event completes, we will re-schedule if the
request bit has been set.

This should ensure that we do not miss any service event schedules,
since we will re-schedule it once the currently running task finishes.
This means that for each request, we will always schedule the service
task to run at least once in full after the request came in.

This will avoid timing issues that can occur with the service event
scheduling. We do pay a cost in re-running many tasks, but all the
service event tasks use either flags to avoid duplicate work, or are
tolerant of being run multiple times.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-04-05 22:47:30 -07:00
Jacob Keller 4692955787 fm10k: future-proof state bitmaps using DECLARE_BITMAP
This ensures that future programmers do not have to remember to re-size
the bitmaps due to adding new values. Although this is unlikely for this
driver, it may happen and it's best to prevent it from ever being an
issue.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-04-05 22:47:30 -07:00
Jacob Keller 3ee7b3a3b9 fm10k: use a BITMAP for flags to avoid race conditions
Replace bitwise operators and #defines with a BITMAP and enumeration
values. This is similar to how we handle the "state" values as well.

This has two distinct advantages over the old method. First, we ensure
correctness of operations which are currently problematic due to race
conditions. Suppose that two kernel threads are running, such as the
watchdog and an ethtool ioctl, and both modify flags. We'll say that the
watchdog is CPU A, and the ethtool ioctl is CPU B.

CPU A sets FLAG_1, which can be seen as
  CPU A read FLAGS
  CPU A write FLAGS | FLAG_1

CPU B sets FLAG_2, which can be seen as
  CPU B read FLAGS
  CPU A write FLAGS | FLAG_2

However, "|=" and "&=" operators are not actually atomic. So this could
be ordered like the following:

CPU A read FLAGS -> variable
CPU B read FLAGS -> variable
CPU A write FLAGS (variable | FLAG_1)
CPU B write FLAGS (variable | FLAG_2)

Notice how the 2nd write from CPU B could actually undo the write from
CPU A because it isn't guaranteed that the |= operation is atomic.

In practice the race windows for most flag writes is incredibly narrow
so it is not easy to isolate issues. However, the more flags we have,
the more likely they will cause problems. Additionally, if such
a problem were to arise, it would be incredibly difficult to track down.

Second, there is an additional advantage beyond code correctness. We can
now automatically size the BITMAP if more flags were added, so that we
do not need to remember that flags is u32 and thus if we added too many
flags we would over-run the variable. This is not a likely occurrence
for fm10k driver, but this patch can serve as an example for other
drivers which have many more flags.

This particular change does have a bit of trouble converting some of the
idioms previously used with the #defines for flags. Specifically, when
converting FM10K_FLAG_RSS_FIELD_IPV[46]_UDP flags. This whole operation
was actually quite problematic, because we actually stored flags
separately. This could more easily show the problem of the above
re-ordering issue.

This is really difficult to test whether atomics make a difference in
practical scenarios, but you can ensure that basic functionality remains
the same. This patch has a lot of code coverage, but most of it is
relatively simple.

While we are modifying these files, update their copyright year.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-04-05 22:47:30 -07:00
Phil Turnbull 540fca35e3 fm10k: correctly check if interface is removed
FM10K_REMOVED expects a hardware address, not a 'struct fm10k_hw'.

Fixes: 5cb8db4a4c ("fm10k: Add support for VF")
Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-04-05 22:47:30 -07:00
Jakub Kicinski c383bdd14f nfp: fix potential use after free on xdp prog
We should unregister the net_device first, before we give back
our reference on xdp_prog.  Otherwise xdp_prog may be freed
before .ndo_stop() disabled the datapath.  Found by code inspection.

Fixes: ecd63a0217 ("nfp: add XDP support in the driver")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 18:46:40 -07:00
Edward Cree 148cbab6cf sfc: don't insert mc_list on low-latency firmware if it's too long
If the mc_list is longer than 256 addresses, we enter mc_promisc mode.
If we're in mc_promisc mode and the firmware doesn't support cascaded
 multicast, normally we also insert our mc_list, to prevent stealing by
 another VI.  However, if the mc_list was too long, this isn't really
 helpful - the MC groups that didn't fit in the list can still get
 stolen, and having only some of them stealable will probably cause
 more confusing behaviour than having them all stealable.  Since
 inserting 256 multicast filters takes a long time and can lead to MCDI
 state machine timeouts, just skip the mc_list insert in this overflow
 condition.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 18:35:21 -07:00
Jakub Kicinski 7c69873727 nfp: add support for .set_link_ksettings()
Support setting link speed and autonegotiation through
set_link_ksettings() ethtool op.  If the port is reconfigured
in incompatible way and reboot is required the netdev will get
unregistered and not come back until user reboots the system.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 10:49:12 -07:00
Jakub Kicinski 5a560832eb nfp: NSP backend for link configuration operations
Add NSP backend for upcoming link configuration operations.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 10:49:12 -07:00
Jakub Kicinski 85eb97dd2f nfp: add extended error messages
Allow NSP to set option code even when error is reported.  This provides
a way for NSP to give user more precise information about why command
failed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 10:49:12 -07:00
Jakub Kicinski e890ae8e49 nfp: turn NSP port entry into a union
Make NSP port structure a union to simplify accessing the fields
from generic macros.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 10:49:12 -07:00
Jakub Kicinski 30a029217d nfp: allow multi-stage NSP configuration
NSP commands may be slow to respond, we should try to avoid doing
a command-per-item when user requested to change multiple parameters
for instance with an ethtool .set_settings() command.

Introduce a way of internal NSP code to carry state in NSP structure
and add start/finish calls to perform the initialization and kick off
of the configuration request, with potentially many parameters being
modified in between.

nfp_eth_set_mod_enable() will make use of the new code internally,
other "set" functions to follow.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 10:49:12 -07:00
Jakub Kicinski ce22f5a2cb nfp: separate high level and low level NSP headers
We will soon add more NSP commands and structure definitions.
Move all high-level NSP header contents to a common nfp_nsp.h file.
Right now it mostly boils down to renaming nfp_nsp_eth.h and
moving some functions from nfp.h there.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 10:49:12 -07:00
Jakub Kicinski 9f9e0da57e nfp: report port type in ethtool
Service process firmware provides us with information about media
and interface (SFP module) plugged in, translate that to Linux's
PORT_* defines and report via ethtool.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 10:49:12 -07:00
Jakub Kicinski 42b1e6aa46 nfp: report auto-negotiation in ethtool
NSP ABI version 0.17 is exposing the autonegotiation settings.
Report whether autoneg is on via ethtool.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 10:49:12 -07:00
Jakub Kicinski 21d529d5eb nfp: report link speed from NSP
On the PF prefer the link speed value provided by the NSP.
Refresh port table if needed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 10:49:12 -07:00
Jakub Kicinski 172f638c93 nfp: add port state refresh
We will need a way of refreshing port state for link settings
get/set.  For get we need to refresh port speed and type.

When settings are changed the reconfiguration may require
reboot before it's effective.  Unregister netdevs affected
by reconfiguration from a workqueue.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 10:49:12 -07:00
Jakub Kicinski cee4295133 nfp: track link state changes
For caching link settings - remember if we have seen link events
since the last time the eth_port information was refreshed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 10:49:12 -07:00
Jakub Kicinski d12537df34 nfp: add mutex protection for the port list
We will want to unregister netdevs after their port got reconfigured.
For that we need to make sure manipulations of port list from the
port reconfiguration flow will not race with driver's .remove()
callback.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 10:49:12 -07:00
Jakub Kicinski b9de00770d nfp: don't spawn netdevs for reconfigured ports
After port reconfiguration (port split, media type change)
firmware will continue to report old configuration until
reboot.  NSP will inform us that reconfiguration is pending.
To avoid user confusion refuse to spawn netdevs until the
new configuration is applied (reboot).

We need to split the netdev to eth_table port matching from
MAC search and move it earlier in the probe() flow.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 10:49:12 -07:00
Jakub Kicinski 265aeb511b nfp: add support for .get_link_ksettings()
Read link speed from the BAR.  This provides very basic information
and works for both PFs and VFs.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 10:49:12 -07:00
Eric Biggers 5e35141066 net: ibm: emac: remove unused sysrq handler for 'c' key
Since commit d6580a9f15 ("kexec: sysrq: simplify sysrq-c handler"),
the sysrq handler for the 'c' key has been sysrq_crash_op.  Debugging
code in the ibm_emac driver also tries to register a handler for the 'c'
key, but this has no effect because register_sysrq_key() doesn't replace
existing handlers.  Since evidently no one has cared enough to fix this
in the last 8 years, and it's very rare for drivers to register sysrq
handlers (for good reason), just remove the dead code.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 07:26:18 -07:00
Dan Carpenter c800460086 qed: Add a missing error code
We should be returning -ENOMEM if qed_mcp_cmd_add_elem() fails.  The
current code returns success.

Fixes: 4ed1eea82a ("qed: Revise MFW command locking")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 06:53:42 -07:00
Dan Carpenter 781159fb9c liquidio: clear the correct memory
There is a cut and paste bug here so we accidentally clear the first
few bytes of "resp" a second time instead clearing "ctx".

Fixes: 50c0add534 ("liquidio: refactor interrupt moderation code")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 06:53:42 -07:00
Joao Pinto 03cf65a959 net: stmmac: rx queue to dma channel mapping fix
In hardware configurations where multiple queues are active,
the rx queue needs to be mapped into a dma channel, even if
a single rx queue is used.

Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 06:47:26 -07:00
Michael Chan 68a946bb81 bnxt_en: Cap the msix vector with the max completion rings.
The current code enables up to the maximum MSIX vectors in the PCIE
config space without considering the max completion rings available.
An MSIX vector is only useful when it has an associated completion
ring, so it is better to cap it.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 06:24:26 -07:00
Michael Chan 932dbf83ba bnxt_en: Use short TX BDs for the XDP TX ring.
No offload is performed on the XDP_TX ring so we can use the short TX
BDs.  This has the effect of doubling the size of the XDP TX ring so
that it now matches the size of the rx ring by default.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 06:24:26 -07:00
Michael Chan 67fea463fd bnxt_en: Add interrupt test to ethtool -t selftest.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 06:24:26 -07:00
Michael Chan 91725d89b9 bnxt_en: Add PHY loopback to ethtool self-test.
It is necessary to disable autoneg before enabling PHY loopback,
otherwise link won't come up.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 06:24:26 -07:00
Michael Chan f7dc1ea6c4 bnxt_en: Add ethtool mac loopback self test.
The mac loopback self test operates in polling mode.  To support that,
we need to add functions to open and close the NIC half way.  The half
open mode allows the rings to operate without IRQ and NAPI.  We
use the XDP transmit function to send the loopback packet.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 06:24:26 -07:00
Michael Chan eb51365846 bnxt_en: Add basic ethtool -t selftest support.
Add the basic infrastructure and only firmware tests initially.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 06:24:26 -07:00
Michael Chan f65a2044a8 bnxt_en: Add suspend/resume callbacks.
Add suspend/resume callbacks using the newer dev_pm_ops method.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 06:24:26 -07:00
Michael Chan 5282db6c79 bnxt_en: Add ethtool set_wol method.
And add functions to set and free magic packet filter.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 06:24:26 -07:00
Michael Chan 8e202366dd bnxt_en: Add ethtool get_wol method.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 06:24:26 -07:00
Michael Chan d196ece740 bnxt_en: Add pci shutdown method.
Add pci shutdown method to put device in the proper WoL and power state.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 06:24:26 -07:00
Michael Chan c1ef146a5b bnxt_en: Add basic WoL infrastructure.
Add code to driver probe function to check if the device is WoL capable
and if Magic packet WoL filter is currently set.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 06:24:26 -07:00
Michael Chan 8eb992e876 bnxt_en: Update firmware interface spec to 1.7.6.2.
Features added include WoL and selftest.

Signed-off-by: Deepak Khungar <deepak.khungar@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 06:24:26 -07:00
Sekhar Nori 30c57f0734 net: ethernet: ti: cpsw: fix race condition during open()
TI's cpsw driver handles both OF and non-OF case for phy
connect. Unfortunately of_phy_connect() returns NULL on
error while phy_connect() returns ERR_PTR().

To handle this, cpsw_slave_open() overrides the return value
from phy_connect() to make it NULL or error.

This leaves a small window, where cpsw_adjust_link() may be
invoked for a slave while slave->phy pointer is temporarily
set to -ENODEV (or some other error) before it is finally set
to NULL.

_cpsw_adjust_link() only handles the NULL case, and an oops
results when ERR_PTR() is seen by it.

Note that cpsw_adjust_link() checks PHY status for each
slave whenever it is invoked. It can so happen that even
though phy_connect() for a given slave returns error,
_cpsw_adjust_link() is still called for that slave because
the link status of another slave changed.

Fix this by using a temporary pointer to store return value
of {of_}phy_connect() and do a one-time write to slave->phy.

Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reported-by: Yan Liu <yan-liu@ti.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-04 10:33:33 -07:00
Colin Ian King a8919661d7 bnx2x: fix spelling mistake in macros HW_INTERRUT_ASSERT_SET_*
Trival fix, rename HW_INTERRUT_ASSERT_SET_* to HW_INTERRUPT_ASSERT_SET_*

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-04 10:04:49 -07:00
Ram Amrani f9dc4d1f0d qed: Manage with less memory regions for RoCE
It's possible some configurations would prevent driver from utilizing
all the Memory Regions due to a lack of ILT lines.
In such a case, calculate how many memory regions would have to be
dropped due to limit, and manage without those.

Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 19:16:37 -07:00
Mintz, Yuval 5f8cb033f4 qed: RoCE doesn't need to use SRC
As RoCE doesn't need to use the SRC, allocating ILT memory
on behalf of RoCE is wasting available ILT lines.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 19:16:37 -07:00
Mintz, Yuval 70566b42e9 qed: Correct TM ILT lines in presence of VFs
As of today there's no protocol supported that requires
support from the TM hardware block and enables SRIOV,
but we should still correct the calculation to reflect
the lines required for such future VFs instead of changing
the PF's own lines.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 19:16:37 -07:00
Michal Kalderon 44531ba45d qed: Fix TM block ILT allocation
When configuring the HW timers block we should set the number of CIDs
up until the last CID that require timers, instead of only those CIDs
whose protocol needs timers support.

Today, the protocols that require HW timers' support have their CIDs
before any other protocol, but that would change in future [when we
add iWARP support].

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 19:16:37 -07:00
Ariel Elior b5a9ee7cf3 qed: Revise QM cofiguration
Refactor and clean up the queue manager initialization logic.
Also, this adds support for RoC low latency queues, which later
would be used for improving RoCE latency in high throughput scenarios.

Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 19:16:37 -07:00
Andrew Lunn d39004ab13 net/faraday: Add missing include of of.h
Breaking the include loop netdevice.h, dsa.h, devlink.h broke this
driver, it depends on includes brought in by these headers. Adding
linux/of.h fixes it.

Fixes: ed0e39e97d34 ("net: break include loop netdevice.h, dsa.h, devlink.h")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 18:58:08 -07:00
Salil b4957ab082 net: hns: Some checkpatch.pl script & warning fixes
This patch fixes some checkpatch.pl script caught errors and
warnings during the compilation time.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
lipeng 820c90cb3e net: hns: Avoid Hip06 chip TX packet line bug
There is a bug on Hip06 that tx ring interrupts packets count will be
clear when drivers send data to tx ring, so that the tx packets count
will never upgrade to packets line, and cause the interrupts engendered
was delayed.
Sometimes, it will cause sending performance lower than expected.

To fix this bug, we set tx ring interrupts packets line to 1 forever,
to avoid count clear. And set the gap time to 20us, to solve the problem
that too many interrupts engendered when packets line is 1.

This patch could advance the send performance on ARM  from 6.6G to 9.37G
when an iperf send thread on ARM and an iperf send thread on X86 for XGE.

Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
Kejian Yan 76b588bc52 net: hns: Adjust the SBM module buffer threshold
HNS needs SMB Buffers to store at least two packets after sending
pause frame because of the link delay. The MTU of HNS is 9728. As
the processor user manual described, the SBM buffer threshold should
be modified.

Reported-by: Ping Zhang <zhangping5@huawei.com>
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Reviewed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
Kejian Yan a2185587ad net: hns: Simplify the exception sequence in hns_ppe_init()
We need to free all ppe submodule if it fails to initialize ppe by
any fault, so this patch will free all ppe resource before
hns_ppe_init() returns exception situation

Reported-by: JinchuanTian <tianjinchuan1@huawei.com>
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Reviewed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
Kejian Yan d592a4a4b9 net: hns: Optimise the code in hns_mdio_wait_ready()
This patch fixes the code to clear pclint warning/info.

Reported-by: Ping Zhang <zhangping5@huawei.com>
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Reviewed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
Kejian Yan 6961acfa5c net: hns: Clean redundant code from hns_mdio.c file
This patch cleans the redundant code from  hns_mdio.c.

Reported-by: Ping Zhang <zhangping5@huawei.com>
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Reviewed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
Kejian Yan 9f1607b8b5 net: hns: Remove redundant mac table operations
This patch removes redundant functions used only for debugging
purposes.

Reported-by: Weiwei Deng <dengweiwei@huawei.com>
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Reviewed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
Kejian Yan 20f0d4f736 net: hns: Remove redundant mac_get_id()
There is a mac_id in mac control block structure, so the callback
function mac_get_id() is useless. Here we remove this function.

Reported-by: Weiwei Deng <dengweiwei@huawei.com>
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Reviewed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
Kejian Yan 040a3800aa net: hns: Remove the redundant adding and deleting mac function
The functions (hns_dsaf_set_mac_mc_entry() and hns_mac_del_mac()) are
not called by any functions. They are dead code in hns. And the same
features are implemented by the patch (the id is 66355f5).

Reported-by: Weiwei Deng <dengweiwei@huawei.com>
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Reviewed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
lipeng 64ec10dc2a net: hns: Correct HNS RSS key set function
This patch fixes below ethtool configuration error:

localhost:~ # ethtool -X eth0 hkey XX:XX:XX...
Cannot set Rx flow hash configuration: Operation not supported

Signed-off-by: lipeng <lipeng321@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
lipeng f2aaed557e net: hns: Replace netif_tx_lock to ring spin lock
netif_tx_lock is a global spin lock, it will take affect
in all rings in the netdevice. In tx_poll_one process, it can
only lock the current ring, in this case, we define a spin lock
in hnae_ring struct for it.

Signed-off-by: lipeng <lipeng321@huawei.com>
reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
lipeng b29bd41259 net: hns: Fix to adjust buf_size of ring according to mtu
Because buf_size of ring set to 2048, the process of rx_poll_one
can reuse the page, therefore the performance of XGE can improve.
But the chip only supports three bds in one package, so the max mtu
is 6K when it sets to 2048. For better performane in litter mtu, we
need change buf_size according to mtu.

When user change mtu, hns is only change the desc in memory. There
are some desc has been fetched by the chip, these desc can not be
changed by the code. So it needs set the port loopback and send
some packages to let the chip consumes the wrong desc and fetch new
desc.
Because the Pv660 do not support rss indirection, we need add version
check in mtu change process.

Signed-off-by: lipeng <lipeng321@huawei.com>
reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
lipeng 36eedfde1a net: hns: Optimize hns_nic_common_poll for better performance
After polling less than buget packages, we need check again. If
there are still some packages, we call napi_schedule add softirq
queue, this is not better way. So we return buget value instead
of napi_schedule.

Signed-off-by: lipeng <lipeng321@huawei.com>
reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
Daode Huang 4b7cdecaa4 net: hns: bug fix of ethtool show the speed
When run ethtool ethX on hns driver, the speed will show
as "Unknown". The base.speed is not correct assigned,
this patch fix this bug.

Signed-off-by: Daode Huang <huangdaode@hisilicon.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
lipeng fb0672d116 net: hns: Remove redundant memset during buffer release
Because all members of desc_cb is assigned when xmit one package, so it
can delete in hnae_free_buffer, as follows:
        - "dma, priv, length, type" are assigned in fill_v2_desc.
        - "page_offset, reuse_flag, buf" are not used in tx direction.

Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: Weiwei Deng <dengweiwei@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
lipeng de99208cc7 net: hns: Optimize the code for GMAC pad and crc Config
This patch optimises the init configuration code leg
for gmac pad and crc set interface.

Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: JinchuanTian <tianjinchuan1@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
lipeng 87ff7e1f46 net: hns: Modify GMAC init TX threshold value
This patch reduces GMAC TX threshold value to avoid gmac
hang-up with speed 100M/duplex half.

Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: JinchuanTian <tianjinchuan1@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
lipeng ba2d079131 net: hns: Fix the implementation of irq affinity function
This patch fixes the implementation of the IRQ affinity
function. This function is used to create the cpu mask
which eventually is used to initialize the cpu<->queue
association for XPS(Transmit Packet Steering).

Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-03 14:48:43 -07:00
Grygorii Strashko 75514b6654 net: ethernet: ti: cpsw: wake tx queues on ndo_tx_timeout
In case, if TX watchdog is fired some or all netdev TX queues will be
stopped and as part of recovery it is required not only to drain and
reinitailize CPSW TX channeles, but also wake up stoppted TX queues what
doesn't happen now and netdevice will stop transmiting data until
reopenned.

Hence, add netif_tx_wake_all_queues() call in .ndo_tx_timeout() to complete
recovery and restore TX path.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-02 19:42:44 -07:00
Andrew Morton e270e96686 drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c: fix build with gcc-4.4.4
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c: In function 'mlx5e_set_rxfh':
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1067: error: unknown field 'rss' specified in initializer
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1067: warning: missing braces around initializer
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1067: warning: (near initialization for 'rrp.<anonymous>')
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1068: error: unknown field 'rss' specified in initializer
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1069: warning: excess elements in struct initializer
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1069: warning: (near initialization for 'rrp')

gcc-4.4.4 has issues with anonymous union initializers.  Work around this.

Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-02 19:32:57 -07:00
Andrew Morton 956327913c drivers/net/ethernet/mellanox/mlx5/core/en_main.c: fix build with gcc-4.4.4
drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_redirect_rqts':
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2210: error: unknown field 'rqn' specified in initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2211: warning: missing braces around initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2211: warning: (near initialization for 'direct_rrp.<anonymous>')
drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_redirect_rqts_to_channels':
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: error: unknown field 'rss' specified in initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: warning: missing braces around initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: warning: (near initialization for 'rrp.<anonymous>')
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: warning: initialization makes integer from pointer without a cast
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2228: error: unknown field 'rss' specified in initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2229: warning: excess elements in struct initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2229: warning: (near initialization for 'rrp')
drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_redirect_rqts_to_drop':
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2238: error: unknown field 'rqn' specified in initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2239: warning: missing braces around initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2239: warning: (near initialization for 'drop_rrp.<anonymous>')

gcc-4.4.4 has issues with anonymous union initializers.  Work around this.

Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-02 19:32:57 -07:00
Joao Pinto 44781fef13 net: stmmac: fix cbs configuration
Sending again, because forgot to include net-dev.

The QoS IP does not accept AVB capabilities to default/queue 0, this way we
guarantee 75% bandwidth for AVB. This patch assures that only queues >= 1
gets CBS confgured. Additional info was also added to stmmac.txt.

Reported-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-02 19:30:06 -07:00
Mark Brown 3af887c38f net/faraday: Explicitly include linux/of.h and linux/property.h
This driver uses interfaces from linux/of.h and linux/property.h but
relies on implict inclusion of those headers which means that changes in
other headers could break the build, as happened in -next for arm today.
Add a explicit includes.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 12:14:00 -07:00
Daode Huang b917078c1c net: hns: Add ACPI support to check SFP present
The current code only supports DT to check SFP present.
This patch adds ACPI support as well.

Signed-off-by: Daode Huang <huangdaode@hisilicon.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 12:10:58 -07:00
Madalin Bucur 58b7bd0f4b dpaa_eth: use AVOIDBLOCK for Tx confirmation queues
The AVOIDBLOCK flag determines the Tx confirmation queues processing
to be redirected to any available CPU when the current one is slow
in processing them. This may result in a higher Tx confirmation
interrupt count but may reduce pressure on a certain CPU that with
the previous setting would process all Tx confirmation frames.

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 12:03:31 -07:00
Madalin Bucur b07e675b06 fsl/fman: take into account all RGMII modes
Accept the internal delay RGMII variants.

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 11:49:14 -07:00
Nathan Fontenot 1b8955ee5f ibmvnic: Cleanup failure path in ibmvnic_open
Now that ibmvnic_release_resources will clean up all of our resources
properly, even if they were not allocated, we can just call this
for failues in ibmvnic_open.

This patch also moves the ibmvnic_release_resources() routine up
in the file to avoid creating a forward declaration ad re-names it to
drop the ibmvnic prefix.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 15:58:43 -07:00
Nathan Fontenot 7bbc27a496 ibmvnic: Create init/release routines for stats token
Create an initialization and a release routine for the stats token used by
the ibmvnic driver.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 15:58:43 -07:00
Nathan Fontenot b510888f96 ibmvnic: Merge the two release_sub_crq_queue routines
Keeping two routines for releasing sub crqs, one for when irqs are not
initialized and one for when they are, is a bit of overkill. Merge the
two routines to a common release routine that will check for an irq
and release it if needed.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 15:58:42 -07:00
Nathan Fontenot 0ffe2cb790 ibmvnic: Create init and release routines for the rx pool
Move the initialization and the release of the rx pool to their own
routines, and update them to do validation.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 15:58:42 -07:00
Nathan Fontenot c657e32cd0 ibmvnic: Create init and release routines for the tx pool
Move the initialization and the release of the tx pool to their own routines,
and update them to do validation. This also adds validation to the release
of the long term buffer.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 15:58:42 -07:00
Nathan Fontenot f0b8c96cbc ibmvnic: Create init and release routines for the bounce buffer
Move the handling of initialization and releasing the bounce buffer to their
own init and release routines.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 15:58:42 -07:00
Nathan Fontenot f992887c34 ibmvnic: Update main crq initialization and release
Update the initialization and release routines for the crq queue so that
we validate the crq queue.

Additionally this updates the naming of the init and release routines
for the crq queue to drop the ibmvnic prefix. This matches the naming
for similar routines in the driver

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 15:58:42 -07:00
Suresh Reddy 0b98ca2a45 be2net: Fix endian issue in logical link config command
Use cpu_to_le32() for link_config variable in set_logical_link_config
command as this variable is of type u32.

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 15:57:33 -07:00
Nathan Fontenot e704f0434e ibmvnic: Remove debugfs support
The debugfs support in the ibmvnic driver is not, and never has been,
supported. Just remove it.

The work done in the debugfs code for the driver was part of the original
spec for the ibmvnic driver. The corresponding support for this from the
server side was never supported and has been dropped.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 12:40:59 -07:00
Florian Westphal 282ccf6efb drivers: add explicit interrupt.h includes
These files all use functions declared in interrupt.h, but currently rely
on implicit inclusion of this file (via netns/xfrm.h).

That won't work anymore when the flow cache is removed so include that
header where needed.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 11:05:34 -07:00
Wadim Egorov eaf70ad14c net: stmmac: dwmac-rk: Add handling for RGMII_ID/RXID/TXID
ATM dwmac-rk will always set and enable it's internal delay lines.
Using PHY internal delays in combination with the phy-mode
rgmii-id/rxid/txid was not possible. Only rgmii was supported.

Now we can disable rockchip's gmac delay lines and also use
rgmii-id/rxid/txid.

Tested only with a RK3288 based board.

Signed-off-by: Wadim Egorov <w.egorov@phytec.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 10:52:41 -07:00
LABBE Corentin 5bacd77849 Revert "net: stmmac: enable multiple buffers"
The commit aff3d9eff8 ("net: stmmac: enable multiple buffers") breaks
numerous boards. while some patch exists for fixing some of it,
dwmac-sunxi is still broken with it.
Since this patch is very huge, it will be better to split it in smaller
part.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 10:50:26 -07:00
Zakharov Vlad 358e78b5f4 ezchip: nps_enet: check if napi has been completed
After a new NAPI_STATE_MISSED state was added to NAPI we can get into
this state and in such case we have to reschedule NAPI as some work is
still pending and we have to process it. napi_complete_done() function
returns false if we have to reschedule something (e.g. in case we were
in MISSED state) as current polling have not been completed yet.

nps_enet driver hasn't been verifying the return value of
napi_complete_done() and has been forcibly enabling interrupts. That is
not correct as we should not enable interrupts before we have processed
all scheduled work. As a result we were getting trapped in interrupt
hanlder chain as we had never been able to disabale ethernet
interrupts again.

So this patch makes nps_enet_poll() func verify return value of
napi_complete_done() and enable interrupts only in case all scheduled
work has been completed.

Signed-off-by: Vlad Zakharov <vzakhar@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-29 14:28:16 -07:00
David S. Miller 397df7092a Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2017-03-29

This series contains updates to i40e and i40evf only.

Preethi changes the default driver mode of operation to descriptor
write-back for VF.

Alex cleans up and addresses several issues in the way that i40e handles
private flags.  Modifies the driver to use the length of the packet
instead of the DD status bit to determine if a new descriptor is ready
to be processed.  Refactors the driver by pulling the code responsible
for fetching the receive buffer and synchronizing DMA into a single
function.  Also pulled the code responsible for handling buffer
recycling and page counting and distributed it through several functions,
so we can commonize the bits that handle either freeing or recycling the
buffers.  Cleans up the code in preparation for us adding support for
build_skb().  Changed the way we handle the maximum frame size for the
receive path so it is more consistent with other drivers.

Paul enables XL722 to use the direct read/write method since it does not
support the AQ command to read/write the control register.

Christopher fixes a case where we miss an arq element if a new one is
added before we enable interrupts and exit the loop.

Jake cleans up a pointless goto statement.  Also cleaned up a flag that
was not being used.

Carolyn does round 2 for adding a delay to the receive queue to
accommodate the hardware needs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-29 14:13:09 -07:00
Jisheng Zhang d6956ac87b net: mvneta: set rx mode during resume if interface is running
I found a bug by:

0. boot and start dhcp client
1. echo mem > /sys/power/state
2. resume back immediately
3. don't touch dhcp client to renew the lease
4. ping the gateway. No acks

Usually, after step2, the DHCP lease isn't expired, so in theory we
should resume all back. But in fact, it doesn't. It turns out
the rx mode isn't resumed correctly. This patch fixes it by adding
mvneta_set_rx_mode(dev) in the resume hook if interface is running.

Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-29 14:08:57 -07:00
Jisheng Zhang a38d20d791 net: mvneta: add RGMII_RXID and RGMII_TXID support
RGMII_RXID and RGMII_TX_ID share the same GMAC CTRL setting as RGMII
or RGMII_ID.

Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-29 14:08:16 -07:00
Michael Chan 3ed3a83e3f bnxt_en: Fix DMA unmapping of the RX buffers in XDP mode during shutdown.
In bnxt_free_rx_skbs(), which is called to free up all RX buffers during
shutdown, we need to unmap the page if we are running in XDP mode.

Fixes: c61fb99cae ("bnxt_en: Add RX page mode support.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-29 14:05:34 -07:00
Sankar Patchineelam 23e12c8934 bnxt_en: Correct the order of arguments to netdev_err() in bnxt_set_tpa()
Signed-off-by: Sankar Patchineelam <sankar.patchineelam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-29 14:05:33 -07:00
Sankar Patchineelam 2247925f09 bnxt_en: Fix NULL pointer dereference in reopen failure path
Net device reset can fail when the h/w or f/w is in a bad state.
Subsequent netdevice open fails in bnxt_hwrm_stat_ctx_alloc().
The cleanup invokes bnxt_hwrm_resource_free() which inturn
calls bnxt_disable_int().  In this routine, the code segment

if (ring->fw_ring_id != INVALID_HW_RING_ID)
   BNXT_CP_DB(cpr->cp_doorbell, cpr->cp_raw_cons);

results in NULL pointer dereference as cpr->cp_doorbell is not yet
initialized, and fw_ring_id is zero.

The fix is to initialize cpr fw_ring_id to INVALID_HW_RING_ID before
bnxt_init_chip() is invoked.

Signed-off-by: Sankar Patchineelam <sankar.patchineelam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-29 14:05:33 -07:00
David S. Miller 6c2257062e mlx5e-pedit 2017-03-28
Or Gerlitz says:
 
 This series adds support for offloading modifications of packet headers using
 ConnectX-5 HW header re-write as an action applied during packet steering.
 
 The offloaded SW mechanism is TC's pedit action. The offloading is
 supported for E-Switch steering of VF traffic in the SRIOV
 switchdev mode and for NIC (non eswitch) RX.
 
 One use-case for this offload on virtual networks, is when the hypervisor
 implements flow based router such as Open-Stack's DVR, where L2 headers
 of guest packets re-written with routers' MAC addresses and the IP TTL
 is decremented.
 
 Another use case (which can be applied in parallel with routing) is
 stateless NAT where guest L3/L4 headers are re-written.
 
 The series is built as follows: the 1st six patches are preperations which
 don't yet add new functionality, patches 7-8 add the FW APIs (data-structures
 and commands) for header re-write, and patch nine allows offloading driver
 to access pedit keys.
 
 The 10th patch is somehow the core of the series, where we translate from
 the pedit way to represent set of header modification elements to the FW
 API for that same matter.
 
 Once a set of HW modification is established, we register it with the FW
 and get a modify header ID. When this ID is used with an action during
 packet steering, the HW applies the header modification on the packet.
 
 Patches 11 and 12 implement the above logic as an offload for pedit action
 for the NIC and E-Switch use-cases.
 
 I'd like to thanks Elijah Shakkour <elijahs@mellanox.com> for implementing
 and helping me testing this functionality on HW simulator, before it could
 be done with FW.
 
 - Or.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJY2lwEAAoJEEg/ir3gV/o+rFIH+wdwGawEjoDhpihLqJHoRtwo
 Wvy88Lczj++Pfzt9E0kgwgmOdnj7j+GVOh6ALjneE3PDBJEFWG/GWY5aRYonlhhf
 zibafMTYf+8Dmm9qHW/C4OvhQowSrkG1RDucM2eyjXJfnAShZCh7dV4CDD7paxhu
 N2rlDdSEl0Im4aPCNHzyrdGg06Fy3A0DQkDvVLIQhKV0cLPIoC0U/i+ymVtsCUY/
 sSEEuSohvwdD5Ga5ZZdKicCo61lIRSi2rX5v4sK0exhAO3S8xyrKnwbiN7nVAQqg
 eVZ/ekbBiksD8MRMKctt/zGxd0X4PDaQ8J9XyF9CL6pRC5VipsDy+P/GEhj/x8U=
 =l2Qo
 -----END PGP SIGNATURE-----

Merge tag 'mlx5e-pedit' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Or Gerlitz says:

====================
mlx5e-pedit 2017-03-28

This series adds support for offloading modifications of packet headers using
ConnectX-5 HW header re-write as an action applied during packet steering.

The offloaded SW mechanism is TC's pedit action. The offloading is
supported for E-Switch steering of VF traffic in the SRIOV
switchdev mode and for NIC (non eswitch) RX.

One use-case for this offload on virtual networks, is when the hypervisor
implements flow based router such as Open-Stack's DVR, where L2 headers
of guest packets re-written with routers' MAC addresses and the IP TTL
is decremented.

Another use case (which can be applied in parallel with routing) is
stateless NAT where guest L3/L4 headers are re-written.

The series is built as follows: the 1st six patches are preperations which
don't yet add new functionality, patches 7-8 add the FW APIs (data-structures
and commands) for header re-write, and patch nine allows offloading driver
to access pedit keys.

The 10th patch is somehow the core of the series, where we translate from
the pedit way to represent set of header modification elements to the FW
API for that same matter.

Once a set of HW modification is established, we register it with the FW
and get a modify header ID. When this ID is used with an action during
packet steering, the HW applies the header modification on the packet.

Patches 11 and 12 implement the above logic as an offload for pedit action
for the NIC and E-Switch use-cases.

I'd like to thanks Elijah Shakkour <elijahs@mellanox.com> for implementing
and helping me testing this functionality on HW simulator, before it could
be done with FW.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-29 11:24:21 -07:00
Wyborny, Carolyn d08a9f6cd1 i40e: fix for queue timing delays
This patch adds a delay to Rx queue disables to accommodate HW needs.

v2: Added missing check for disable only, additional details on the
need for the ugly delay and fixed spacing on comment.

Change-ID: I2864ca667ce5dcc2cc44f8718113b719742a46a1
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:23:00 -07:00
Alexander Duyck dab86afdbb i40e/i40evf: Change the way we limit the maximum frame size for Rx
This patch changes the way we handle the maximum frame size for the Rx
path.  Previously we were rounding up to 2K for a 1500 MTU and then brining
the max frame size down to MTU plus a fixed amount.  With this patch
applied what we now do is limit the maximum frame to 1.5K minus the value
for NET_IP_ALIGN for standard MTU, and for any MTU greater than 1500 we
allow up to the maximum frame size.  This makes the behavior more
consistent with the other drivers such as igb which had similar logic.  In
addition it reduces the test matrix for MTU since we only have two max
frame sizes that are handled for Rx now.

Change-ID: I23a9d3c857e7df04b0ef28c64df63e659c013f3f
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:07 -07:00
Alexander Duyck c424d4a3dd i40e/i40evf: Add legacy-rx private flag to allow fallback to old Rx flow
This patch adds a control which will allow us to toggle into and out of the
legacy Rx mode.  The legacy Rx mode is what we currently do when performing
Rx.  As I make further changes what should happen is that the driver will
fall back to the behavior for Rx as of this patch should the "legacy-rx"
flag be set to on.

Change-ID: I0342998849bbb31351cce05f6e182c99174e7751
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Alexander Duyck fa2343e903 i40e/i40evf: Break i40e_fetch_rx_buffer up to allow for reuse of frag code
This patch is meant to clean up the code in preparation for us adding
support for build_skb.  Specifically we deconstruct i40e_fetch_buffer into
several functions so that those functions can later be reused when we add a
path for build_skb.

Specifically with this change we split out the code for adding a page to an
exiting skb.

Change-ID: Iab1efbab6b8b97cb60ab9fdd0be1d37a056a154d
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Alexander Duyck a0cfc3130e i40e/i40evf: Pull out code for cleaning up Rx buffers
This patch pulls out the code responsible for handling buffer recycling and
page counting and distributes it through several functions.  This allows us
to commonize the bits that handle either freeing or recycling the buffers.

As far as the page count tracking one change to the logic is that
pagecnt_bias is decremented as soon as we call i40e_get_rx_buffer.  It is
then the responsibility of the function that pulls the data to either
increment the pagecnt_bias if the buffer can be recycled as-is, or to
update page_offset so that we are pointing at the correct location for
placement of the next buffer.

Change-ID: Ibac576360cb7f0b1627f2a993d13c1a8a2bf60af
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Alexander Duyck 9a064128fc i40e/i40evf: Pull code for grabbing and syncing rx_buffer from fetch_buffer
This patch pulls the code responsible for fetching the Rx buffer and
synchronizing DMA into a function, specifically called i40e_get_rx_buffer.

The general idea is to allow for better code reuse by pulling this out of
i40e_fetch_rx_buffer.  We dropped a couple of prefetches since the time
between the prefetch being called and the data being accessed was too small
to be useful.

Change-ID: I4885fce4b2637dbedc8e16431169d23d3d7e79b9
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Alexander Duyck d57c0e08c7 i40e/i40evf: Use length to determine if descriptor is done
This change makes it so that we use the length of the packet instead of the
DD status bit to determine if a new descriptor is ready to be processed.
The obvious advantage is that it cuts down on reads as we don't really even
need the DD bit if going from a 0 to a non-zero value on size is enough to
inform us that the packet has been completed.

Change-ID: Iebdf9cdb36c454ef092df27199b92ad09c374231
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Jacob Keller 3a104f8df2 i40e: remove FDIR_REQUIRES_REINIT driver flag
This flag hasn't been used since commit 1e1be8f622 ("i40e: ATR policy
change to flush the table to clean stale ATR rules").

Lets simplify things and just remove it.

Change-ID: I76279d84db8a2fd96f445b96aa413059f9256879
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Jacob Keller d9eaf12e85 i40e: remove a useless goto statement
The goto found here for when in MFP mode is pointless. It jumps to the
end of a series of if blocks. However, right after this statement is
a closing '}' for this if block, which will result in the program flow
going to the exact same location as the goto statement indicates. Thus,
regardless of whether we are in MFP mode, the program flow will resume
from the same location.

This arose due to various refactoring which did not notice that this
goto became essentially a no-op.

To properly understand this diff you will need to view a larger context
than is given by default.

Change-ID: I088f73c3831aa5c4e2281380c7a3ce605594300c
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Christopher N Bednarz 1fca3265be i40e: Check for new arq elements before leaving the adminq subtask loop
Fix a case where we miss an arq element if a new one is added before we
enable interrupts and exit the arq subtask loop. This occurs frequently
with RDMA running on Windows VF and causes long delays that prevent SMB
from establishing connections.

Change-ID: I3e1c8b2b960c12857d9b8275bea2c1563674392e
Signed-off-by: Christopher N Bednarz <christopher.n.bednarz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Paul M Stillwell Jr 6030308ef8 i40e: use register for XL722 control register read/write
The XL722 doesn't support the AQ command to read/write the control
register so enable it to bypass the check and use the direct read/write
method.

Change-ID: Iefecc737b57207485c90845af5989d5af518bf16
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Alexander Duyck aca955d831 i40e: Clean up handling of private flags
This patch cleans up and addresses several issues in the way that i40e
handles private flags. Previously the code was choosing fixed bits and
trying to match them up with strings in a somewhat haphazard way. This
resulted in the possibility for adding a new bit and causing a mismatch as
the private flags are linear bits starting at 0, and the private flags in
the driver were split up over a group specific to the PF and a group that
was global.

What this change does is define an array of structs used to represent the
private flags. Contained within the structs are the bits necessary to know
which flags to set and/or clear depending on the state of the bit. By
doing this we can add new bits in the future with minimal overhead and
avoid creating possible mis-matches should we need to remove a flag based
on compile options.

Change-ID: Ia3214ab04f0ab2f70354ac0997a135f1d01b0acd
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Preethi Banala b1cb07db6e i40evf: enforce descriptor write-back mechanism for VF
The current driver mode is to use a write-back mechanism for the head
register which indicates transmit completions. The VF driver needs to be
able to work on hardware that exclusively uses descriptor write-back, so
change the default driver mode of operation to descriptor write-back for
VF. In our analysis, performance wasn't significantly different with
either write-back method.

Change-ID: Ia92e4ec77c2df8dc4515c71d53746d57d77759af
Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Andrew Lunn c6e970a04b net: break include loop netdevice.h, dsa.h, devlink.h
There is an include loop between netdevice.h, dsa.h, devlink.h because
of NETDEV_ALIGN, making it impossible to use devlink structures in
dsa.h.

Break this loop by taking dsa.h out of netdevice.h, add a forward
declaration of dsa_switch_tree and netdev_set_default_ethtool_ops()
function, which is what netdevice.h requires.

No longer having dsa.h in netdevice.h means the includes in dsa.h no
longer get included. This breaks a few other files which depend on
these includes. Add these directly in the affected file.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 22:46:04 -07:00
Prasad Kanneganti 50c0add534 liquidio: refactor interrupt moderation code
Refactor interrupt moderation code for flexibility because parameters are
different for 10G and 25G cards.  Currently parameters (for 10G only) come
from macros compiled-in to the PF and VF drivers; fix it so that parameters
suitable for the card (10G or 25G) come from the NIC firmware via response
to a command.

Also bump up driver version to 1.5.1 to match newer NIC firmware version.

Signed-off-by: Prasad Kanneganti <prasad.kanneganti@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 22:22:03 -07:00
Arnd Bergmann 16b8b6de32 rocker: fix Wmaybe-uninitialized false-positive
gcc-7 reports a warning that earlier versions did not have:

drivers/net/ethernet/rocker/rocker_ofdpa.c: In function 'ofdpa_port_stp_update':
arch/x86/include/asm/string_32.h:79:22: error: '*((void *)&prev_ctrls+4)' may be used uninitialized in this function [-Werror=maybe-uninitialized]
   *((short *)to + 2) = *((short *)from + 2);
   ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/rocker/rocker_ofdpa.c:2218:7: note: '*((void *)&prev_ctrls+4)' was declared here

This is clearly a variation of the warning about 'prev_state' that
was shut up using uninitialized_var().

We can slightly simplify the code and get rid of the warning by unconditionally
saving the prev_state and prev_ctrls variables. The inlined memcpy is not
particularly expensive here, as it just has to read five bytes from one or
two cache lines.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 21:42:32 -07:00
Talat Batheesh e497ec680c net/mlx5: Avoid dereferencing uninitialized pointer
In NETDEV_CHANGEUPPER event the upper_info field is valid
only when linking is true. Otherwise it should be ignored.

Fixes: 7907f23adc (net/mlx5: Implement RoCE LAG feature)
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Reviewed-by: Aviv Heller <avivh@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 18:07:15 -07:00
Mintz, Yuval d0d40a73f1 qed: Use BDQ resource for storage protocols
Until now, qed used some port-defined value as BDQ index for both iSCSI
and FCoE.

As management firmware now treats BDQ as a resource and tells each PF
its BDQ-range, start using a valure from that range instead.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 18:05:24 -07:00
Tomer Tayar 9c8517c40f qed: Utilize resource-lock based scheme
Management firmware is used as an arbiter between the various PFs
in matters of resources, but some of the resources that need to
be divided are dependent on the non-management firmware used,
so management firmware first needs to be told how many resources
there are before trying to divide them.

As part of the initialization sequence, driver would first inform
the management firmware of the available resources under
a dedicated resource lock, and afterwards request for various
resources which might be based on the previous set values.

Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 18:05:24 -07:00
Tomer Tayar 95691c9cea qed: Support management-based resource locking
Global locking can't properly be used to synchronize between different
PFs in all scenarios, as those instances might reside in different
logical partitions [e.g., when a PF is assigned via PDA to some VM].

The management firmware provides a generic infrastructure for
device locks. For each 'resource', it's guaranteed it could be acquired
by at most a single PF at any given time [or by management firmware].

This patch adds the necessary logic in qed for utilizing said
infrastructure, implementing lock/unlock internal APIs.

Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 18:05:23 -07:00
Mintz, Yuval 18a69e368b qed: Send pf-flr as part of initialization
During HW initialization, driver would set various registers to their
needed values - but it assumes all registers start at their reset-value,
so there's no need to re-configure a register's default value.

This assumption might be incorrect, e.g., in case of preboot driver
running and initializing the driver prior to our driver.

To overcome this, we now ask management firmware to initiate a PF-flr
early during the initialization sequence. That would return everything
in the PF's scope back to default and prevent previous configurations
from still being applied.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 18:05:23 -07:00
Tomer Tayar 5d24bcf189 qed: Move to new load request scheme
Management firmware is used as an arbiter between the various PFs
in regard to loading - it causes the various PFs to load/unload
sequentially and informs each of its appropriate rule in the init.

But the existing flow is too weak to handle some scenarios where
PFs aren't properly cleaned prior to loading.
The significant scenarios falling under this criteria:
  a. Preboot drivers in some environment can't properly unload.
  b. Unexpected driver replacement [kdump, PDA].

Modern management firmware supports a more intricate loading flow,
where the driver has the ability to overcome previous limitations.
This moves qed into using this newer scheme.

Notice new scheme is backward compatible, so new drivers would
still be able to load properly on top of older management firmwares
and vice versa.

Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 18:05:23 -07:00
Mintz, Yuval c0c2d0b49e qed: hw_init() to receive parameter-struct
We'll soon need additional information, so start by changing
the infrastructure to receive the initializing variables
via a parameter struct.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 18:05:23 -07:00
Tomer Tayar 1226337ad9 qed: Correct HW stop flow
Management firmware is used as arbiter between different PFs
which are loading/unloading, but in order to use the synchronization
it offers the contending configurations need to be applied either
between their LOAD_REQ <-> LOAD_DONE or UNLOAD_REQ <-> UNLOAD_DONE
management firmware commands.

Existing HW stop flow utilizes 2 different functions: qed_hw_stop() and
qed_hw_reset() which don't abide this requirement; Most of the closure
is doing outside the scope of the unload request.

This patch removes qed_hw_reset() and places the relevant stop
functionality underneath the management firmware protection.

Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 18:05:23 -07:00
Jonas Jensen c2b341a620 net: moxa: fix TX overrun memory leak
moxart_mac_start_xmit() doesn't care where tx_tail is, tx_head can
catch and pass tx_tail, which is bad because moxart_tx_finished()
isn't guaranteed to catch up on freeing resources from tx_tail.

Add a check in moxart_mac_start_xmit() stopping the queue at the
end of the circular buffer. Also add a check in moxart_tx_finished()
waking the queue if the buffer has TX_WAKE_THRESHOLD or more
free descriptors.

While we're at it, move spin_lock_irq() to happen before our
descriptor pointer is assigned in moxart_mac_start_xmit().

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=99451

Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 18:02:05 -07:00
Arnd Bergmann 589a1a2e63 stmmac: use netif_set_real_num_{rx,tx}_queues
A driver must not access the two fields directly but should instead use
the helper functions to set the values and keep a consistent internal
state:

ethernet/stmicro/stmmac/stmmac_main.c: In function 'stmmac_dvr_probe':
ethernet/stmicro/stmmac/stmmac_main.c:4083:8: error: 'struct net_device' has no member named 'real_num_rx_queues'; did you mean 'real_num_tx_queues'?

Fixes: a8f5102af2 ("net: stmmac: TX and RX queue priority configuration")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 18:00:34 -07:00
Arkadi Sharshevsky 2ba5999f00 mlxsw: spectrum: Add Support for erif table entries access
Implement dpipe's table ops for erif table which provide:
1. Getting the entries in the table with the associate values.
	- match on "mlxsw_meta:erif_index"
	- action on "mlxsw_meta:forwared_out"
2. Synchronize the hardware in case of enabling/disabling counters which
   mean removing erif counters from all interfaces.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 17:11:55 -07:00
Arkadi Sharshevsky fd1b9d4192 mlxsw: spectrum_router: Add rif helper functions
Add rif helper function to access the rif index and rif devices ifindex.
This functions will be used by dpipe in order to dump the rif table.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 17:11:55 -07:00
Arkadi Sharshevsky e0c0afd8aa mlxsw: spectrum: Support for counters on router interfaces
Add support for counter allocation on router interfaces. The allocation
depends on the counter state of relevant table. In case the counting is
disabled or no counters left the counter index will be set as invalid.

Also a counter pool for router allocation is added.

Signed-off-by: Arakdi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 17:11:55 -07:00
Arkadi Sharshevsky ba73e97a63 mlxsw: reg: Add Router Interface Counter Register
The RICNT register retrieves per port performance counter. It will be
used to query the router interfaces statistics.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 17:11:55 -07:00
Arkadi Sharshevsky d54b70feb6 mlxsw: spectrum: Add definition for egress rif table
Add definition for egress router interface table. This table describes
the final part in the routing pipeline. This table matches the egress
interface index (rif index, which is set by the previous stages and
determine the out port) and makes the decision of forwarding the packet
towards the L2 logic or dropping it.

The metadata header is added to represent this internal information.
The rif index field is mapped logically to netdevice ifindex.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 17:11:54 -07:00
Arkadi Sharshevsky 230ead0141 mlxsw: spectrum: Add placeholder for dpipe
Add placeholder for dpipe. Support for specific tables and headers will
be introduced in following patches. The headers are shared between all
mlxsw_sp instances.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 17:11:54 -07:00
Arkadi Sharshevsky 0f630fcbe5 mlxsw: reg: Add counter fields to RITR register
Update RITR for counter support. This allows adding counters for
ASIC's router ports.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 17:11:54 -07:00
Or Gerlitz d7e75a325c net/mlx5e: Add offloading of E-Switch TC pedit (header re-write) actions
This includes calling the parsing code that translates from pedit
speak to the HW API, allocation (deallocation) of a modify header
context and setting the modify header id associated with this
context to the FTE of that flow.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:34:10 +03:00
Or Gerlitz 2f4fe4cab0 net/mlx5e: Add offloading of NIC TC pedit (header re-write) actions
This includes calling the parsing code that translates from pedit
speak to the HW API, allocation (deallocation) of a modify header
context and setting the modify header id associated with this
context to the FTE of that flow.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:34:08 +03:00
Or Gerlitz d79b6df6b1 net/mlx5e: Add parsing of TC pedit actions to HW format
Parse/translate a set of TC pedit actions to be formed in the HW API format.

User-space provides set of keys where each one of them is made of: command (add or
set), header-type, byte offset within that header along with a 32 bit mask and value.

The mask dictates what bits in the 32 bit word that starts on the offset we should
be dealing with, but under negative polarity (unset bits are to be modified).

We do a 1st pass over the set of keys while using the header-type and offset to
fill the masks and the values into a data-structure containting all the
supported network headers.

We then do a 2nd pass over the set of fields to re-write supported by the HW,
where for each such candidate field, we use the masks filled on the 1st pass to
realize if we should offloading re-write it.

In case offloading is required, we fill a HW descriptor with the following:

(1) the header field to modify
(2) the bit offset within the field from where to modify (set command only)
(3) the value to set/add
(4) the length in bits 1...32 to modify (set command only)

Note that it's possible for a given pedit mask to dictate modifying the
same header field multiple times or to modify multiple header fields.
Currently such combinations are not supported for offloading, hence, for set
commands, the offset within the field is always zero, and the length to modify
is the field size.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:34:07 +03:00
Or Gerlitz 2de24fedb9 net/mlx5: Introduce alloc/dealloc modify header context commands
Implement the low-level commands to support packet header re-write.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:34:05 +03:00
Or Gerlitz 2a69cb9ff7 net/mlx5: Introduce modify header structures, commands and steering action definitions
Add the definitions related to creation/deletion of a modify header
context and the modify header steering action which are used for HW
packet header modify (re-write) as part of steering. Add as well the
modify header id into two intermediate structs and set it to the FTE.

Note that as the push/pop vlan steering actions are emulated by the
ewitch management code, we're not breaking any compatibility while
changing their values to make room for the modify header action which
is not emulated and whose value is part of the FW API. The new bit
values for the emulated actions are at the end of the possible range.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:34:04 +03:00
Or Gerlitz a750276f81 net/mlx5: Reorder few command cases to reflect their natural order
Move the commands related to scheduling elements and vport qos to
a suitable location (according to the MLX5_CMD_OP enum values) in
the command string and internal error helpers.

This patch doesn't change any functionality.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:34:03 +03:00
Or Gerlitz e753b2b50d net/mlx5: Add helper to initialize a flow steering actions struct instance
There are bunch of places in the code where the intermediate struct
that keeps the elements related to flow actions is initialized with
the same default values. Put that into a small DECLARE type helper.

This patch doesn't change any functionality.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:34:01 +03:00
Or Gerlitz aa0cbbae5d net/mlx5e: Properly deal with resource cleanup when adding TC flow fails
The code for adding tc fdb flows leaves things half set when it fails
in the middle. Currently we are not leaking things (e.g eswitch
vlan reference, encap reference and HW resources) since the main
code to add flower rules does a cleanup by calling mlx5e_tc_del_flow().

This cleanup further works just b/c we're checking there if the HW rule
for the flow we are attempting to delete is valid before touching it, and
since under the current possible combinations of supported actions it's okay
to go and blidnly deref or delete all the action related resources (encap, vlan).

Instead, do things properly, namely make sure that if add flow fails we
clean all what was allocated or referenced. Now, the flow delete code can
blindly deref/deallocate both the rule and the actions related resources and
when more action combinations are introduced (such as the upcoming header
re-write) we are fine with clear and robust code.

While here, align all of nic/fdb parse actions/add flow functions to get
mlx5e_tc_flow struct param and pick the attributes or whatever else needed
from there.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:34:00 +03:00
Or Gerlitz 17091853fc net/mlx5e: Add intermediate struct for TC flow parsing attributes
Add intermediate structure to store attributes parsed from TC filter
matching/actions parts which are soon to be configured into the HW.

Currently put there the flow matching spec after being parsed. More
content to be added in down-stream patch.

This patch doesn't change any functionality.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:33:59 +03:00
Or Gerlitz 3bc4b7bfa0 net/mlx5e: Add NIC attributes for offloaded TC flows
Add structure that contains the attributes related to offloaded
NIC flows. Currently it has the actions and flow tag.

While here, do xmas tree cleanup of the TC configure function.

This patch doesn't change any functionality.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:33:58 +03:00
Or Gerlitz ecf5bb796b net/mlx5e: Add prefix for e-switch offloaded TC flow attributes
Add esw_ prefix to the flow attributes attached to offloaded e-switch
TC flows. This is a pre-step to add attributes to offloaded NIC TC flows.

Also, save one pointer space by using gcc's zero size array, this would
be beneficial for environments where 100Ks (or Ms) of flows are offloaded.

This patch doesn't change any functionality.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:33:57 +03:00
David S. Miller cc628c9680 mlx5e-failsafe 27-03-2017
This series provides a fail-safe mechanism to allow safely re-configuring
 mlx5e netdevice and provides a resiliency against sporadic
 configuration failures.
 
 To enable this we do some refactoring and code reorganizing to allow
 breaking the drivers open/close flows to stages:
       open -> activate -> deactivate -> close.
 
 In addition we need to allow creating fresh HW ring resources
 (mlx5e_channels) with their own "new" set of parameters, while keeping
 the current ones running and active until the new channels are
 successfully created with the new configuration, and only then we can
 safly replace (switch) old channels with new ones.
 
 For that we introduce mlx5e_channels object and an API to manage it:
  - channels = open_channels(new_params):
    open fresh TX/RX channels
  - activate_channels(channels):
    redirect traffic to them and attach them to the netdev
  - deactivate_channes(channels)
    stop traffic and detach from netdev
  - close(channels)
    Free the TX/RX HW resources of those channels
 
 With the above strategy it is straightforward to achieve the desired
 behavior of fail-safe configuration.  In pseudo code:
 
 make_new_config(new_params)
 {
 	old_channels = current_active_channels;
 	new_channels = create_channels(new_params);
 	if (!new_channels)
 		return "Failed, but current channels are still active :)"
 
 	deactivate_channels(old_channels); /* Can't fail */
 	set_hw_new_state();                /* If needed  */
 	activate_channels(new_channels);   /* Can't fail */
 	close_channels(old_channels);
 	current_active_channels = new_channels;
 
         return "SUCCESS";
 }
 
 At the top of this series, we change the following flows to be fail-safe:
 ethtool:
    - ring parameters
    - coalesce parameters
    - tx copy break parameters
    - cqe compressing/moderation mode setting (priv flags)
 ndos:
    - tc setup
    - set features: LRO
    - change mtu
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJY2XfKAAoJEEg/ir3gV/o+6fAIAKBsqf+EYhbHA0JoTnV1sm3G
 PSGjj5VCMNTZPyDlTWLEpY2S5TIDRPvICC04i5jWFjo5SOmsRMR6ZV0llHukKC4k
 SAkAYU4A78Ds7UhmWzokebwzWa8VA48eqLRxXV60EAhJ0BOgzZnG09KIpzdplE7A
 pco+F/c/qzJa0NP1KQBBrYIcXbGMrCFcYM8d6lJ8TRfVDdZZpeTB/wvxRixKfe1L
 Ji6+k5tbDynDD3+HWkWq+chAkw4yldN7q8fC8FaN2r0mtWYsYbVSPuP+BlL0XN4R
 oluZEJjnyaCePaqUMW+ZYVb1hCGP7pOoJkBb901XdOnX5M2fU9vK3VufWErYF/s=
 =r6Qw
 -----END PGP SIGNATURE-----

Merge tag 'mlx5e-failsafe' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-failsafe 27-03-2017

This series provides a fail-safe mechanism to allow safely re-configuring
mlx5e netdevice and provides a resiliency against sporadic
configuration failures.

To enable this we do some refactoring and code reorganizing to allow
breaking the drivers open/close flows to stages:
      open -> activate -> deactivate -> close.

In addition we need to allow creating fresh HW ring resources
(mlx5e_channels) with their own "new" set of parameters, while keeping
the current ones running and active until the new channels are
successfully created with the new configuration, and only then we can
safly replace (switch) old channels with new ones.

For that we introduce mlx5e_channels object and an API to manage it:
 - channels = open_channels(new_params):
   open fresh TX/RX channels
 - activate_channels(channels):
   redirect traffic to them and attach them to the netdev
 - deactivate_channes(channels)
   stop traffic and detach from netdev
 - close(channels)
   Free the TX/RX HW resources of those channels

With the above strategy it is straightforward to achieve the desired
behavior of fail-safe configuration.  In pseudo code:

make_new_config(new_params)
{
	old_channels = current_active_channels;
	new_channels = create_channels(new_params);
	if (!new_channels)
		return "Failed, but current channels are still active :)"

	deactivate_channels(old_channels); /* Can't fail */
	set_hw_new_state();                /* If needed  */
	activate_channels(new_channels);   /* Can't fail */
	close_channels(old_channels);
	current_active_channels = new_channels;

        return "SUCCESS";
}

At the top of this series, we change the following flows to be fail-safe:
ethtool:
   - ring parameters
   - coalesce parameters
   - tx copy break parameters
   - cqe compressing/moderation mode setting (priv flags)
ndos:
   - tc setup
   - set features: LRO
   - change mtu
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 21:16:03 -07:00
Jacob Keller 7be147dc14 i40e: initialize params before notifying of l2_param_changes
Probably due to some mis-merging fix a bug associated with commits
d7ce6422d6 ("i40e: don't check params until after checking for client
instance", 2017-02-09) and 3140aa9a78c9 ("i40e: KISS the client
interface", 2017-03-14)

The first commit tried to move the initialization of the params
structure so that we didn't bother doing this if we didn't have a client
interface. You can already see that it looks fishy because of the
indentation. The second commit refactors a bunch of the interface, and
incorrectly drops the params initialization.

I believe what occurred is that internally the two patches were
re-ordered, and the merge conflicts as a result were performed
incorrectly.

Fix the use of an uninitialized variable by correctly initializing the
params variable via i40e_client_get_params().

Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:47:44 -07:00
Colin Ian King 703ba88548 i40evf: dereference VSI after VSI has been null checked
VSI is being dereferenced before the VSI null check; if VSI is
null we end up with a null pointer dereference.  Fix this by
performing VSI deference after the VSI null check.  Also remove
the need for using adapter by using vsi->back->cinst.

Detected by CoverityScan, CID#1419696, CID#1419697
("Dereference before null check")

Fixes: ed0e894de7 ("i40evf: add client interface")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:47:44 -07:00
Alexander Duyck c76cb6ed54 i40e: Drop FCoE code that always evaluates to false or 0
Since FCoE isn't supported by the i40e products there isn't much point in
carrying around code that will always evaluate to false. This patch goes
through and strips out the code in several spots so that we don't go around
carrying variables and/or code that is always going to evaluate to false or
0.

Change-ID: I39d1d779c66c638b75525839db2b6208fdc809d7
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:47:44 -07:00
Alexander Duyck 9eed69a914 i40e: Drop FCoE code from core driver files
Looking over the code for FCoE it looks like the Rx path has been broken at
least since the last major Rx refactor almost a year ago.  It seems like
FCoE isn't supported for any of the Fortville/Fortpark hardware so there
isn't much point in carrying the code around, especially if it is broken
and untested.

Change-ID: I892de8fa551cb129ce2361e738ff82ce55fa229e
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:47:43 -07:00
Alexander Duyck a5b268e4b1 i40e/i40evf: Clean-up process_skb_fields
This is a minor clean-up to make the i40e/i40evf process_skb_fields
function look a little more like what we have in igb.  The Rx checksum
function called out a need for skb->protocol but I can't see where it
actually needs it.  I am assuming this is something that was likely
refactored out some time ago as the Rx checksum code has gone through a few
rewrites.

Change-ID: I0b4668a34d90b61b66ded7c7c26e19a3e2d06251
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:47:43 -07:00
Bimmy Pujari 0a25b7311d i40e: removed no longer needed delays
Removed no longer needed delays.  At preproduction stage those delays were
needed but now these delays are not needed.

Signed-off-by: Bimmy Pujari <bimmy.pujari@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:47:43 -07:00
Robert Konklewski beff3e9d80 i40e: Fixed race conditions in VF reset
First, this patch eliminates IOMMU DMAR Faults caused by VF hardware.
This is done by enabling VF hardware only after VSI resources are
freed. Otherwise, hardware could DMA into memory that is (or just has
been) being freed.

Then, the VF driver is activated only after VSI resources have been
reallocated. That's because the VF driver can request resources
immediately after it's activated. So they need to be ready at that
point.

The second race condition happens when the OS initiates a VF reset,
and then before it's finished modifies VF's settings by changing its
MAC, VLAN ID, bandwidth allocation, anti-spoof checking, etc. These
functions needed to be blocked while VF is undergoing reset. Otherwise,
they could operate on data structures that had just been freed or not
yet fully initialized.

Change-ID: I43ba5a7ae2c9a1cce3911611ffc4598ae33ae3ff
Signed-off-by: Robert Konklewski <robertx.konklewski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:45:14 -07:00
Alexander Duyck 741b8b832a i40e/i40evf: Fix use after free in Rx cleanup path
We need to reset skb back to NULL when we have freed it in the Rx cleanup
path.  I found one spot where this wasn't occurring so this patch fixes it.

Change-ID: Iaca68934200732cd4a63eb0bd83b539c95f8c4dd
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:45:14 -07:00
Harshitha Ramamurthy f25571b576 i40e: fix configuration of RSS table with DCB
There exists a bug in the driver where the calculation of the
RSS size was not taking into account the number of traffic classes
enabled. This patch factors in the traffic classes both in
the initial configuration of the table as well as reconfiguration.

Change-ID: I34dcd345ce52faf1d6b9614bea28d450cfd5f621
Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:45:14 -07:00
Alexander Duyck 1793668c3b i40e/i40evf: Update code to better handle incrementing page count
Update the driver code so that we do bulk updates of the page reference
count instead of just incrementing it by one reference at a time.  The
advantage to doing this is that we cut down on atomic operations and
this in turn should give us a slight improvement in cycles per packet.
In addition if we eventually move this over to using build_skb the gains
will be more noticeable.

I also found and fixed a store forwarding stall from where we were
assigning "*new_buff = *old_buff".  By breaking it up into individual
copies we can avoid this and as a result the performance is slightly
improved.

Change-ID: I1d3880dece4133eca3c32423b04a5467321ccc52
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:45:13 -07:00
Tobias Klauser 656455bf19 net: ibmvnic: Remove unused net_stats member from struct ibmvnic_adapter
The ibmvnic driver keeps its statistics in net_device->stats, so the
net_stats member in struct ibmvnic_adapter is unused. Remove it.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 16:02:00 -07:00
Tobias Klauser 8c2ef1978f net: ibmveth: Remove unused stats member from struct ibmveth_adapter
The ibmveth driver keeps its statistics in net_device->stats, so the
stats member in struct ibmveth_adapter is unused. Remove it.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 16:02:00 -07:00
Tobias Klauser 8570bcd83d net: bfin_mac: Remove unused stats member from struct bfin_mac_local
The bfin_mac driver keeps its statistics in net_device->stats, so the
stats member in struct bfin_mac_local is unused. Remove it, as well as
the accompanying comment.

Cc: adi-buildroot-devel@lists.sourceforge.net
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 16:01:59 -07:00
Philippe Reynes 86573f6152 net: tehuti: use new api ethtool_{get|set}_link_ksettings
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone may test this patch.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 16:00:07 -07:00
Saeed Mahameed 2e20a15120 net/mlx5e: Fail safe mtu and lro setting
Use the new fail-safe channels switch mechanism to set new
netdev mtu and lro settings.

MTU and lro settings demand some HW configuration changes after new
channels are created and ready for action. In order to unify switch
channels routine for LRO and MTU changes, and maybe future configuration
features, we now pass to it a modify HW function pointer to be
invoked directly after old channels are de-activated and before new
channels are activated.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:24 +03:00
Saeed Mahameed 6f9485af40 net/mlx5e: Fail safe tc setup
Use the new fail-safe channels switch mechanism to set up new
tc parameters.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:23 +03:00
Saeed Mahameed be7e87f92b net/mlx5e: Fail safe cqe compressing/moderation mode setting
Use the new fail-safe channels switch mechanism to set new
CQE compressing and CQE moderation mode settings.

We also move RX CQE compression modify function out of en_rx file  to
a more appropriate place.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:22 +03:00
Saeed Mahameed 546f18ed3f net/mlx5e: Fail safe ethtool settings
Use the new fail-safe channels switch mechanism to set new ethtool
settings:
 - ring parameters
 - coalesce parameters
 - tx copy break parameters

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:21 +03:00
Saeed Mahameed 55c2503dae net/mlx5e: Introduce switch channels
A fail safe helper functions that allows switching to new channels on the
fly,  In simple words:

make_new_config(new_params)
{
    new_channels = open_channels(new_params);
    if (!new_channels)
         return "Failed, but current channels are still active :)"

    switch_channels(new_channels);

    return "SUCCESS";
}

Demonstrate mlx5e_switch_priv_channels usage in set channels ethtool
callback and make it fail-safe using the new switch channels mechanism.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:20 +03:00
Saeed Mahameed 9008ae0748 net/mlx5e: Minimize mlx5e_{open/close}_locked
mlx5e_redirect_rqts_to_{channels,drop} and mlx5e_{add,del}_sqs_fwd_rules
and Set real num tx/rx queues belong to
mlx5e_{activate,deactivate}_priv_channels, for that we move those functions
and minimize mlx5e_open/close flows.

This will be needed in downstream patches to replace old channels with new
ones without the need to call mlx5e_close/open.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:19 +03:00
Saeed Mahameed a43b25daef net/mlx5e: CQ and RQ don't need priv pointer
Remove mlx5e_priv pointer from CQ and RQ structs,
it was needed only to access mdev pointer from priv pointer.

Instead we now pass mdev where needed.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:18 +03:00
Saeed Mahameed 6a9764efb2 net/mlx5e: Isolate open_channels from priv->params
In order to have a clean separation between channels resources creation
flows and current active mlx5e netdev parameters, make sure each
resource creation function do not access priv->params, and only works
with on a new fresh set of parameters.

For this we add "new" mlx5e_params field to mlx5e_channels structure
and use it down the road to mlx5e_open_{cq,rq,sq} and so on.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:18 +03:00
Saeed Mahameed acc6c5953a net/mlx5e: Split open/close channels to stages
As a foundation for safe config flow, a simple clear API such as
(Open then Activate) where the "Open" handles the heavy unsafe
creation operation and the "activate" will be fast and fail safe,
to enable the newly created channels.

For this we split the RQs/TXQ SQs and channels open/close flows to
open => activate, deactivate => close.

This will simplify the ability to have fail safe configuration changes
in downstream patches as follows:

make_new_config(new_params)
{
     old_channels = current_active_channels;
     new_channels = create_channels(new_params);
     if (!new_channels)
              return "Failed, but current channels still active :)"
     deactivate_channels(old_channels); /* Can't fail */
     activate_channels(new_channels); /* Can't fail */
     close_channels(old_channels);
     current_active_channels = new_channels;

     return "SUCCESS";
}

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:17 +03:00
Saeed Mahameed b676f65389 net/mlx5e: Refactor refresh TIRs
Rename mlx5e_refresh_tirs_self_loopback to mlx5e_refresh_tirs,
as it will be used in downstream (Safe config flow) patches, and make it
fail safe on mlx5e_open.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:16 +03:00
Saeed Mahameed a5f97fee74 net/mlx5e: Redirect RQT refactoring
RQ Tables are always created once (on netdev creation) pointing to drop RQ
and at that stage, RQ tables (indirection tables) are always directed to
drop RQ.

We don't need to use mlx5e_fill_{direct,indir}_rqt_rqns to fill the drop
RQ in create RQT procedure.

Instead of having separate flows to redirect direct and indirect RQ Tables
to the current active channels Receive Queues (RQs), we unify the two
flows by introducing mlx5e_redirect_rqt function and redirect_rqt_param
struct. Combined, they provide one generic logic to fill the RQ table RQ
numbers regardless of the RQ table purpose (direct/indirect).

Demonstrated the usage with mlx5e_redirect_rqts_to_channels which will
be called on mlx5e_open and with mlx5e_redirect_rqts_to_drop which will
be called on mlx5e_close.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:15 +03:00
Saeed Mahameed ff9c852f91 net/mlx5e: Introduce mlx5e_channels
Have a dedicated "channels" handler that will serve as channels
(RQs/SQs/etc..) holder to help with separating channels/parameters
operations, for the downstream fail-safe configuration flow, where we will
create a new instance of mlx5e_channels with the new requested parameters
and switch to the new channels on the fly.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:14 +03:00
Saeed Mahameed be4891af77 net/mlx5e: Set netdev->rx_cpu_rmap on netdev creation
To simplify mlx5e_open_locked flow we set netdev->rx_cpu_rmap on netdev
creation rather on netdev open, it is redundant to set it every time on
mlx5e_open_locked.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:13 +03:00
Saeed Mahameed 7f859ecfa8 net/mlx5e: Set SQ max rate on mlx5e_open_txqsq rather on open_channel
Instead of iterating over the channel SQs to set their max rate, do it
on SQ creation per TXQ SQ.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:12 +03:00
Arnd Bergmann 834a61d455 net: hns: avoid gcc-7.0.1 warning for uninitialized data
hns_dsaf_set_mac_key() calls dsaf_set_field() on an uninitialized field,
which will then change only a few of its bits, causing a warning with
the latest gcc:

hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_set_mac_uc_entry':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
   (origin) &= (~(mask)); \
            ^~
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_set_mac_mc_entry':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_add_mac_mc_port':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_del_mac_entry':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_rm_mac_addr':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_del_mac_mc_port':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_get_mac_uc_entry':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_get_mac_mc_entry':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]

The code is actually correct since we always set all 16 bits of the
port_vlan field, but gcc correctly points out that the first
access does contain uninitialized data.

This initializes the field to zero first before setting the
individual bits.

Fixes: 5483bfcb16 ("net: hns: modify tcam table and set mac key")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-25 20:05:32 -07:00
Arnd Bergmann a17f1861b5 net: hns: fix uninitialized data use
When dev_dbg() is enabled, we print uninitialized data, as gcc-7.0.1
now points out:

ethernet/hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_set_promisc_tcam':
ethernet/hisilicon/hns/hns_dsaf_main.c:2947:75: error: 'tbl_tcam_data.low.val' may be used uninitialized in this function [-Werror=maybe-uninitialized]
ethernet/hisilicon/hns/hns_dsaf_main.c:2947:75: error: 'tbl_tcam_data.high.val' may be used uninitialized in this function [-Werror=maybe-uninitialized]

We also pass the data into hns_dsaf_tcam_mc_cfg(), which might later
use it (not sure about that), so it seems safer to just always initialize
the tbl_tcam_data structure.

Fixes: 1f5fa2dd1c ("net: hns: fix for promisc mode in HNS driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-25 20:05:32 -07:00
Arkadi Sharshevsky 1312444374 mlxsw: spectrum_kvdl: Cosmetic kvdl allocator API change
Currently the return allocated index and err value are multiplexed.
This patch changes the API to decouple the ret value from the allocated
index.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-25 19:56:15 -07:00
Alexander Duyck 13a8cd191a i40e: Do not enable NAPI on q_vectors that have no rings
When testing the epoll w/ busy poll code I found that I could get into a
state where the i40e driver had q_vectors w/ active NAPI that had no rings.
This was resulting in a divide by zero error.  To correct it I am updating
the driver code so that we only support NAPI on q_vectors that have 1 or
more rings allocated to them.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:28:55 -07:00
Saeed Mahameed 3139104861 net/mlx5e: Different SQ types
Different SQ types (tx, xdp, ico) are growing apart, we separate them
and remove unwanted parts in each one of them, to simplify data path and
utilize data cache.

Remove DB union from SQ structures since it is not needed anymore as we
now have different SQ data type for each SQ.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:46 -07:00
Saeed Mahameed 33ad971186 net/mlx5e: Generalize SQ create/modify/destroy functions
In the next patches we will introduce different SQ types,
and we would want to reuse those functions, in this patch we make them
agnostic to SQ type (txq, xdp, ico).

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:46 -07:00
Saeed Mahameed 3b77235b94 net/mlx5e: Proper names for SQ/RQ/CQ functions
Rename mlx5e_{create,destroy}_{sq,rq,cq} to
mlx5e_{alloc,free}_{sq,rq,cq}.

Rename mlx5e_{enable,disable}_{sq,rq,cq} to
mlx5e_{create,destroy}_{sq,rq,cq}.

mlx5e_{enable,disable}_{sq,rq,cq} used to actually create/destroy the SQ
in FW, so we rename them to align the functions names with FW semantics.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:46 -07:00
Saeed Mahameed 864b2d7153 net/mlx5e: Generalize tx helper functions for different SQ types
In the next patches we will introduce different SQ types, for that we here
generalize some TX helper functions to work with more basic SQ parameters,
in order to re-use them for the different SQ types.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:46 -07:00
Saeed Mahameed 2239185ccd net/mlx5e: Optimize XDP frame xmit
XDP SQ has a fixed size WQE (MLX5E_XDP_TX_WQEBBS = 1) and only posts
one kind of WQE (MLX5_OPCODE_SEND),

Also we initialize SQ descriptors static fields once on open_xdpsq,
rather than every time on critical path.

Optimize the code in light of those facts and add a prefetch of the TX
descriptor first thing in the xdp xmit function.

Performance improvement:
System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz

Test case              Before     Now        improvement
---------------------------------------------------------------
XDP TX   (1 core)      13Mpps    13.7Mpps       5%

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:46 -07:00
Saeed Mahameed 39e12351a3 net/mlx5e: Poll XDP TX CQ before RX CQ
Handle XDP TX completions before handling RX packets, to make sure more
free space is available for XDP TX packets a moment before handling
RX packets.

Performance improvement:
System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz

Test case              Before     Now      improvement
---------------------------------------------------------------
XDP Drop (1 core)      16.9Mpps  16.9Mpps    No change
XDP TX   (1 core)      12Mpps    13Mpps      8%

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:46 -07:00
Saeed Mahameed 31871f87bb net/mlx5e: Move XDP SQ instance into RQ
To save many rq->channel->sq dereferences in fast-path.
And rename it to xdpsq.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:45 -07:00
Saeed Mahameed eba2db2bd2 net/mlx5e: Move mlx5e_rq struct declaration
Move struct mlx5e_rq and friends to appear after mlx5e_sq declaration in
en.h.

We will need this for next patch to move the mlx5e_sq instance into
mlx5e_rq struct for XDP SQs.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:45 -07:00
Saeed Mahameed 1c4bf94045 net/mlx5e: Move XDP completion functions to rx file
XDP code belongs to RX path, move mlx5e_poll_xdp_tx_cq and
mlx5e_free_xdp_tx_descs to en_rx.c.

Rename them to mlx5e_poll_xdpsq_cq and mlx5e_free_xdpsq_descs.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:45 -07:00
Saeed Mahameed aff2615763 net/mlx5e: Single bfreg (UAR) for all mlx5e SQs and netdevs
One is sufficient since Blue Flame is not supported anymore.
This will also come in handy for switchdev mode to save resources, since
VF representors will use same single UAR as well for their own SQs.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:45 -07:00
Saeed Mahameed 6982ab6097 net/mlx5e: Xmit, no write combining
mlx5e netdev Blue Flame (write combining) support demands a lot of
overhead for a little latency gain for some special cases, this overhead
is hurting the common case.

Here we remove xmit Blue Flame support by creating all bfregs with no
write combining for all SQs, and we remove a lot of BF logic and
conditions from xmit data path.

Simplify mlx5e_tx_notify_hw (doorbell function) by removing BF related
code and by removing one memory barrier needed for WC mapped SQ doorbell
buffers, which no longer exist.

Performance improvement:
System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz

Test case                   Before      Now      improvement
---------------------------------------------------------------
TX packets (24 threads)     50Mpps      54Mpps    8%

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:45 -07:00
Saeed Mahameed 80fe326ab8 net/mlx5e: Use dma_rmb rather than rmb in CQE fetch routine
Use dma_rmb in mlx5e_get_cqe rather than aggressive rmb (at least on
some architectures), this should help improve the performance on such
CPU archs where dma_rmb is optimized.

Performance improvement:
System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz

Test case                   Baseline      Now      improvement
---------------------------------------------------------------
TX packets (24 threads)     45Mpps        50Mpps      11%
TC stack Drop (1 core)      3.45Mpps      3.6Mpps     5%
XDP Drop      (1 core)      14Mpps        16.9Mpps    20%
XDP TX        (1 core)      10.4Mpps      12Mpps      15%

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:44 -07:00
Ido Schimmel 18281f2dab mlxsw: spectrum: Query cell size from firmware
As explained in the previous patch, the cell size may change in future
devices, so query it from the firmware instead of hard coding it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 13:53:29 -07:00
Ido Schimmel f417f04da5 mlxsw: spectrum: Refactor port buffer configuration
The sizes and thresholds of the priority group (PG) buffers are
configured in cells, which represent a specific amount of bytes.

The cell size can vary in different devices, so it's better to query it
from the firmware than hard coding it.

Refactor the code dealing with this value into different functions, so
that it will be easier to make the conversion in the next patch.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 13:53:29 -07:00
Ido Schimmel d3daae1b08 mlxsw: spectrum_buffers: Query shared buffer size from firmware
Instead of hard coding the size of the shared buffer in the driver,
query it from the firmware, as it may change in future devices.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 13:53:28 -07:00
Ido Schimmel 5ec2ee7dd2 mlxsw: Query maximum number of ports from firmware
We currently hard code the maximum number of ports in the driver, but
this may change in future devices, so query it from the firmware
instead.

Fallback to a maximum of 64 ports in case this number can't be queried.
This should only happen in SwitchX-2 for which this number is correct.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 13:53:28 -07:00
Ido Schimmel 8494ab06e0 mlxsw: spectrum_router: Query number of LPM trees from firmware
Instead of hard coding the number of LPM trees in the driver, query it
from the firmware, as it may change in future devices.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 13:53:28 -07:00
David S. Miller ba82427d4a Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2017-03-23

This series contains updates to i40e and i40e.txt documentation.

Jake provides all the changes in the series which are centered around
ntuple filter fixes and additional support.  Fixed the current
implementation of .set_rxnfc, where we were not reading the mask field
for filter entries which was resulting in filters not behaving as
expected and not working correctly.  When cleaning up after disabling
flow director support, ensure that the default input set is correctly
reprogrammed.  Since the hardware only supports a single input set for
all flows of that type, the driver shall only allow the input set to
change if there are no other configured filters for that flow type, so
add support to detect when we can update the input set for each flow
type.  Align the driver to other drivers to partition the ring_cookie
value into 8bits of VF index, along with 32bits of queue number instead
of using the user-def field.  Added support to parse the user-def field
into a data structure format to allow future extensions of the user-def
filed by keeping all the code that read/writes the field into a single
location.  Added support for flexible payloads passed via ethtool
user-def field.  We support a single flexible word (2byte) value per
protocol type, and we handle the FLX_PIT register using a list of
flexible entries so that each flow type may be configured separately.
Enabled flow director filters for SCTPv4 packets using the ethtool
ntuple interface to enable filters.  Updated the documentation on the
i40e driver to include the newly added support to ntuple filters.
Reduced complexity of a if-continue-else-break section of code by
taking advantage of using hlist_for_each_entry_continue() instead.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 13:45:07 -07:00
Jeff Kirsher 9f47a48e6e Revert "e1000e: driver trying to free already-free irq"
This reverts commit 7e54d9d063.

After additional regression testing, several users are experiencing
kernel panics during shutdown on e1000e devices.  Reverting this
change resolves the issue.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 13:43:11 -07:00
Felix Manlunas 7cc61db9c7 liquidio: do not reset Octeon if NIC firmware was preloaded
The PF driver is incorrectly resetting Octeon when the module parameter
"fw_type=none" is there.  "fw_type=none" means the PF should not load any
firmware to the NIC because Octeon is already running preloaded firmware.

Fix it by putting an if (fw_type != none) around the reset code.

Because the Octeon reset is now conditionally gone, when unloading the
driver, conditionally send the RESET_PF command to the firmware who will
then free up PF-related data structures.

Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 13:20:43 -07:00
Florian Fainelli e9d7af78b2 net: systemport: Simplify circular pointer arithmetic
Similar to c298ede2fe ("net: bcmgenet: simplify circular pointer
arithmetic") we don't need to complex arthimetic since we always have a
ring size that is a power of 2.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:53:15 -07:00
Florian Fainelli 6baa785a9c net: systemport: Clear status to reduce spurious interrupts
Do something similar to commit d5810ca325 ("net: bcmgenet: clear
status to reduce spurious interrupts") and clear interrupts right before
servicing them. This reduces the number of interrupts by 10K
interrupts/sec for a TX TCP session 1Gbits/sec.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:53:14 -07:00
Florian Fainelli 30defeb2fb net: systemport: Track per TX ring statistics
bcm_sysport_tx_reclaim_one() is currently summing TX bytes/packets in a
way that is not SMP friendly, mutliples CPUs could run
bcm_sysport_tx_reclaim_one() independently and still update
stats->tx_bytes and stats->tx_packets, cloberring the other CPUs
statistics.

Fix this by tracking per TX rings the number of bytes, packets,
dropped and errors statistics, and provide a bcm_sysport_get_nstats()
function which aggregates everything and returns a consistent output.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:53:14 -07:00
Arnd Bergmann a5af839253 bna: avoid writing uninitialized data into hw registers
The latest gcc-7 snapshot warns about bfa_ioc_send_enable/bfa_ioc_send_disable
writing undefined values into the hardware registers:

drivers/net/ethernet/brocade/bna/bfa_ioc.c: In function 'bfa_iocpf_sm_disabling_entry':
arch/arm/include/asm/io.h:109:22: error: '*((void *)&disable_req+4)' is used uninitialized in this function [-Werror=uninitialized]
arch/arm/include/asm/io.h:109:22: error: '*((void *)&disable_req+8)' is used uninitialized in this function [-Werror=uninitialized]

The two functions look like they should do the same thing, but only one
of them initializes the time stamp and clscode field. The fact that we
only get a warning for one of the two functions seems to be arbitrary,
based on the inlining decisions in the compiler.

To address this, I'm making both functions do the same thing:

- set the clscode from the ioc structure in both
- set the time stamp from ktime_get_real_seconds (which also
  avoids the signed-integer overflow in 2038 and extends the
  well-defined behavior until 2106).
- zero-fill the reserved field

Fixes: 8b230ed8ec ("bna: Brocade 10Gb Ethernet device driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:49:12 -07:00
LABBE Corentin 270c7759fb net: stmmac: add set_mac to the stmmac_ops
Two different set_mac functions exists but stmmac_dwmac4_set_mac() is
only used for enabling and never for disabling.
So on dwmac4, the MAC RX/TX is never disabled.

This patch add a generic function pointer set_mac() to stmmac_ops and
replace all call to stmmac_set_mac/stmmac_dwmac4_set_mac by a call to
this pointer.

Since dwmac4_ops is const, set_mac cannot be modified after, and so dwmac4_ops
is duplioacted like dwmac4_dma_ops.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:36:42 -07:00
Pavel Belous 5d73bb863c net:ethernet:aquantia: Reset is_gso flag when EOP reached.
We need to reset is_gso flag when EOP reached (entire LSO packet processed).

Fixes: bab6de8fd1 ("net: ethernet: aquantia:
 Atlantic A0 and B0 specific functions.")

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:32:19 -07:00
Pavel Belous 386aff88e3 net:ethernet:aquantia: Fix for LSO with IPv6.
Fix Context Command bit: L3 type = "0" for IPv4, "1" for IPv6.

Fixes: bab6de8fd1 ("net: ethernet: aquantia:
 Atlantic A0 and B0 specific functions.")

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:32:19 -07:00
Pavel Belous 9f0dd8c322 net:ethernet:aquantia: Missing spinlock initialization.
Fix for missing initialization aq_ring header.lock spinlock.

Fixes: 018423e90b ("net: ethernet: aquantia: Add ring support code")

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:32:19 -07:00
Pavel Belous ea0504f554 net:ethernet:aquantia: Fix packet type detection (TCP/UDP) for IPv6.
In order for the checksum offloads to work correctly we need to set the
packet type bit (TCP/UDP) in the TX context buffer.

Fixes: 97bde5c4f9 ("net: ethernet: aquantia: Support for NIC-specific code")

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Tested-by: David Arcari <darcari@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:32:19 -07:00
Pavel Belous 1adbddef11 net:ethernet:aquantia: Remove adapter re-opening when MTU changed.
Closing/opening the adapter is not needed at all.
The new MTU settings take effect immediately.

Fixes: 97bde5c4f9 ("net: ethernet: aquantia: Support for NIC-specific code")

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:32:19 -07:00
Ido Schimmel 9a32562bec mlxsw: Remove debugfs interface
We don't use it during development and we can't extend it either, so
remove it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 21:29:32 -07:00
Jacob Keller 584a88709b i40e: make use of hlist_for_each_entry_continue
Replace a complex if->continue->else->break construction in
i40e_next_filter. We can simply use hlist_for_each_entry_continue
instead. This drops a lot of confusing code. The resulting code is much
easier to understand the intention, and follows the more normal pattern
for using hlist loops. We could have also used a break with a "return
next" at the end of the function, instead of return NULL, but the
current implementation is explicitly clear that when you reach the end
of the loop you get a NULL value. The alternative construction is less
clear since the reader would have to know that next is NULL at the end
of the loop.

Change-Id: Ife74ca451dd79d7f0d93c672bd42092d324d4a03
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller f223c8752a i40e: add support for SCTPv4 FDir filters
Enable FDir filters for SCTPv4 packets using the ethtool ntuple
interface to enable filters. The ethtool API does not allow masking on
the verification tag.

Change-Id: I093e88a8143994c7e6f4b7b17a0bd5cf861d18e4
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller 0e588de17f i40e: implement support for flexible word payload
Add support for flexible payloads passed via ethtool user-def field.
This support is somewhat limited due to hardware design. The input set
can only be programmed once per filter type, and the flexible offset is
part of this filter input set. This means that the user cannot program
both a regular and a flexible filter at the same time for a given flow
type. Additionally, the user may not program two flexible filters of the
same flow type with different offsets, although they are allowed to
configure different values at that offset location.

We support a single flexible word (2byte) value per protocol type, and
we handle the FLX_PIT register using a list of flexible entries so that
each flow type may be configured separately.

Due to hardware implementation, the flexible data is offset from the
start of the packet payload, and thus may not be in part of the header
data. For this reason, the offset provided by the user defined data is
interpreted as a byte offset from the start of the matching payload.
Previous implementations have tried to represent the offset as from the
start of the frame, but this is not feasible because header sizes may
change due to options.

Change-Id: 36ed27995e97de63f9aea5ade5778ff038d6f811
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller e793095e8a i40e: add parsing of flexible filter fields from userdef
Add code to parse the user-def field into a data structure format. This
code is intended to allow future extensions of the user-def field by
keeping all code that actually reads and writes the field into a single
location. This ensures that we do not litter the driver with references
to the user-def field and minimizes the amount of bitwise operations we
need to do on the data.

Add code which parses the lower 32bits into a flexible word and its
offset. This will be used in a future patch to enable flexible filters
which can match on some arbitrary data in the packet payload. For now,
we just return -EOPNOTSUPP when this is used.

Add code to fill in the user-def field when reporting the filter back,
even though we don't actually implement any user-def fields yet.

Additionally, ensure that we mask the extended FLOW_EXT bit from the
flow_type now that we will be accepting filters which have the FLOW_EXT
bit set (and thus make use of the user-def field).

Change-Id: I238845035c179380a347baa8db8223304f5f6dd7
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller 43b15697a3 i40e: partition the ring_cookie to get VF index
Do not use the user-def field for determining the VF target. Instead,
similar to ixgbe, partition the ring_cookie value into 8bits of VF
index, along with 32bits of queue number. This is better than using the
user-def field, because it leaves the field open for extension in
a future patch which will enable flexible data. Also, this matches with
convention used by ixgbe and other drivers.

Change-Id: Ie36745186d817216b12f0313b99ec95cb8a9130c
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller 9229e99334 i40e: allow changing input set for ntuple filters
Add support to detect when we can update the input set for each flow
type.

Because the hardware only supports a single input set for all flows of
that matching type, the driver shall only allow the input set to change
if there are no other configured filters for that flow type.

Thus, the first filter added for each flow type is allowed to change the
input set, and all future filters must match the same input set. Display
a diagnostic message whenever the filter input set changes, and
a warning whenever a filter cannot be accepted because it does not match
the configured input set.

Change-Id: Ic22e1c267ae37518bb036aca4a5694681449f283
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller 3bcee1e653 i40e: restore default input set for each flow type
Ensure that the default input set is correctly reprogrammed when
cleaning up after disabling flow director support. This ensures that the
programmed value will be in a clean state.

Although we do not yet have support for SCTPv4 filters, a future patch
will add support for this protocol, so we will correctly restore the
SCTPv4 input set here as well. Note that strictly speaking the default
hardware value for SCTP includes matching the verification tag. However,
the ethtool API does not have support for specifying this value, so
there is no reason to keep the verification field enabled.

This patch is the next step on the way to enabling partial tuple filters
which will be implemented in a following patch.

Change-Id: Ic22e1c267ae37518bb036aca4a5694681449f283
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller 36777d9fa2 i40e: check current configured input set when adding ntuple filters
Do not assume that hardware has been programmed with the default mask,
but instead read the input set registers to determine what is currently
programmed. This ensures that all programmed filters match exactly how
the hardware will interpret them, avoiding confusion regarding filter
behavior.

This sets the initial ground-work for allowing custom input sets where
some fields are disabled. A future patch will fully implement this
feature.

Instead of using bitwise negation, we'll just explicitly check for the
correct value. The use of htonl and htons are used to silence sparse
warnings. The compiler should be able to handle the constant value and
avoid actually performing a byteswap.

Change-Id: I3d8db46cb28ea0afdaac8c5b31a2bfb90e3a4102
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller faa16e0f38 i40e: correctly honor the mask fields for ETHTOOL_SRXCLSRLINS
The current implementation of .set_rxnfc does not properly read the mask
field for filter entries. This results in incorrect driver behavior, as
we do not reject filters which have masks set to ignore some fields. The
current implementation simply assumes that every part of the tuple or
"input set" is specified. This results in filters not behaving as
expected, and not working correctly.

As a first step in supporting some partial filters, add code which
checks the mask fields and rejects any filters which do not have an
acceptable mask. For now, we just assume that all fields must be set.

This will get the driver one step towards allowing some partial filters.
At a minimum, the ethtool commands which previously installed filters
that would not function will now return a non-zero exit code indicating
failure instead.

We should now be meeting the minimum requirements of the .set_rxnfc API,
by ensuring that all filters we program have a valid mask value for each
field.

Finally, add code to report the mask correctly so that the ethtool
command properly reports the mask to the user.

Note that the typecast to (__be16) when checking source and destination
port masks is required because the ~ bitwise negation operator does not
correctly handle variables other than integer size.

Change-Id: Ia020149e07c87aa3fcec7b2283621b887ef0546f
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jie Deng 67ff2c71bb net: dwc-xlgmac: use dual license
The driver "dwc-xlgmac" is dual-licensed.
Declare the dual license with MODULE_LICENSE().

Signed-off-by: Jie Deng <jiedeng@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 17:04:14 -07:00
Jie Deng ea8c1c642e net: dwc-xlgmac: declaration of dual license in headers
The driver "dwc-xlgmac" is dual-licensed. This patch adds
declaration of dual license in file headers.

Signed-off-by: Jie Deng <jiedeng@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 17:04:14 -07:00
David S. Miller 16ae1f2236 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/broadcom/genet/bcmmii.c
	drivers/net/hyperv/netvsc.c
	kernel/bpf/hashtab.c

Almost entirely overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 16:41:27 -07:00
Mintz, Yuval dec26533ae qed: Reserve VF feature before PF
Align the driver feature distribution with the flow utilized
by the management firmware - first reserve L2 queues for
VFs and use all the remaining for the PF.

The current distribution might lead to PFs with an enormous
amount of queues, but at the same time leave us with insufficient
resources for starting all VFs.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 11:53:30 -07:00
Mintz, Yuval 810bb1f0d3 qed: Don't waste SBs unused by RoCE
When RoCE is enabled on a given L2 interface, the interrupt lines
are divided equally between L2 and RoCE -
But in case number of lines needed for RoCE is limited by number
of available CNQs, we can utilize the additional lines for L2.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 11:53:30 -07:00
Mintz, Yuval 3981594409 qed: Reduce verbosity of unimplemented MFW messages
Management firmware and driver are meant to be both backward and forward
compatibile with each other.

If a new mangement firmware would work with an older driver,
it's possible that driver would receive indications which are meaningless
to it. That's perfectly acceptible from the firmware part - so no need to
log such messages at default verbosity; That would only serve to confuse
users.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 11:53:30 -07:00
Mintz, Yuval 1799100207 qed: Correct endian order of MAC passed to MFW
The management firmware is running on a Big Endian processor,
and when running on LE platform HW is configured to swap access
to memory shared between management firmware and driver on
32-bit granulariy.

As a result, for matters of simplicity most of the APIs between
driver and management firmware are based on 32-bit variables.
MAC settings are one exception, as driver needs to fill a byte
array when indicating to management firmware that primary MAC
has changed.
Due to the swap, driver must make sure that the mac that was
provided in byte-order would be translated into native order,
otherwise after the swap the management firmware would read
it swapped.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 11:53:30 -07:00
Tomer Tayar 2f67af8c59 qed: Pass src/dst sizes when interacting with MFW
The driver interaction with management firmware involves a union
of all the data-members relating to the commands the driver prepares.

Current interface assumes the caller always passes such a union -
but thats cumbersome as well as risky [chancing a stack corruption
in case caller accidentally passes a smaller member instead of union].

Change implementation so that caller could pass a pointer to any
of the members instead of the union.

Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 11:53:29 -07:00
Tomer Tayar 4ed1eea82a qed: Revise MFW command locking
Interaction of driver -> management firmware is based
on a one-pending mailbox [per interface], and various
mailbox commands need to be synchronized.

Current scheme is messy, and there's a difficulty extending
it as it deals differently with various commands as well as
making assumption on the required behavior for load/unload
requests.

Drop the current scheme into a completion-list-based approach;
Each flow would try sending the command when possible,
allowing one flow to complete another flow's completion and
relieve the mailbox before sending its own command.

Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 11:53:29 -07:00
Pavel Belous 68c3865903 net:ethernet:aquantia: Fix for RX checksum offload.
Since AQC-100/107/108 chips supports hardware checksums for RX we should indicate this
via NETIF_F_RXCSUM flag.

v1->v2: 'Signed-off-by' tag added.

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:40:52 -07:00
Lendacky, Thomas f43feef4e6 amd-xgbe: Fix the ECC-related bit position definitions
The ECC bit positions that describe whether the ECC interrupt is for
Tx, Rx or descriptor memory and whether the it is a single correctable
or double detected error were defined in incorrectly (reversed order).
Fix the bit position definitions for these settings so that the proper
ECC handling is performed.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:39:58 -07:00
VSR Burru 6069f3fbde liquidio: fix tx completions in napi poll
If there are no egress packets pending, then don't look for tx completions
in napi poll.  Also, fix broken tx queue wakeup logic.

Signed-off-by: VSR Burru <veerasenareddy.burru@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:36:44 -07:00
Satanand Burla 031d4f1210 liquidio: allocate RX buffers in OOM conditions in PF and VF
Add workqueue that is periodically run to try to allocate RX buffers in OOM
conditions in PF and VF.

Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:36:43 -07:00
Dan Carpenter c04ca616ee sfc: cleanup a condition in efx_udp_tunnel_del()
Presumably if there is an "add" function, there is also a "del"
function.  But it causes a static checker warning because it looks like
a common cut and paste bug.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:25:02 -07:00
Iyappan Subramanian 1ffa8a7aa5 drivers: net: xgene-v2: misc fixes
Fixed review comments from the previous patch-set.

- changed return value check of platform_get_irq() to < 0
- replaced devm_request(free)_irq() calls by request(free)_irq() since
  they are called from open() and close()
- changed sizeof(struct mystruct) to sizeof(*mystruct)
- reduced indentation on tx_timeout()

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:17:14 -07:00
Iyappan Subramanian b2180a8f28 drivers: net: xgene-v2: Fix port reset
Fixed port reset sequence by adding ECC init.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:17:14 -07:00
Iyappan Subramanian 617d795c7c drivers: net: xgene-v2: Add ethtool support
Added basic ethtool support.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:17:14 -07:00
Iyappan Subramanian ea8ab16ab2 drivers: net: xgene-v2: Add MDIO support
Added phy management support by using phy abstraction layer APIs.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:17:13 -07:00
Joao Pinto b4f0a66155 net: stmmac: fix dma operation mode config for older versions
The dma operation mode configuration routine was wrongly moved to a
function (stmmac_mtl_configuration) that is only executed if the
core version is >= 4.00.

Fixes: 6deee2221e ("net: stmmac: prepare dma op mode config for multiple queues")
Reported-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Reviewed-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 15:35:03 -07:00
Jakub Kicinski ac0488ef59 nfp: disable FW on reconfiguration errors
Since we no longer need to keep the FW enabled for .ndo_close()
to work we can always stop FW after reconfiguration failure.
This seems to make most FWs more resilient to faults (at least
in error injection scenarios).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 12:59:09 -07:00