Commit Graph

66693 Commits

Author SHA1 Message Date
Or Gerlitz e753b2b50d net/mlx5: Add helper to initialize a flow steering actions struct instance
There are bunch of places in the code where the intermediate struct
that keeps the elements related to flow actions is initialized with
the same default values. Put that into a small DECLARE type helper.

This patch doesn't change any functionality.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:34:01 +03:00
Or Gerlitz aa0cbbae5d net/mlx5e: Properly deal with resource cleanup when adding TC flow fails
The code for adding tc fdb flows leaves things half set when it fails
in the middle. Currently we are not leaking things (e.g eswitch
vlan reference, encap reference and HW resources) since the main
code to add flower rules does a cleanup by calling mlx5e_tc_del_flow().

This cleanup further works just b/c we're checking there if the HW rule
for the flow we are attempting to delete is valid before touching it, and
since under the current possible combinations of supported actions it's okay
to go and blidnly deref or delete all the action related resources (encap, vlan).

Instead, do things properly, namely make sure that if add flow fails we
clean all what was allocated or referenced. Now, the flow delete code can
blindly deref/deallocate both the rule and the actions related resources and
when more action combinations are introduced (such as the upcoming header
re-write) we are fine with clear and robust code.

While here, align all of nic/fdb parse actions/add flow functions to get
mlx5e_tc_flow struct param and pick the attributes or whatever else needed
from there.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:34:00 +03:00
Or Gerlitz 17091853fc net/mlx5e: Add intermediate struct for TC flow parsing attributes
Add intermediate structure to store attributes parsed from TC filter
matching/actions parts which are soon to be configured into the HW.

Currently put there the flow matching spec after being parsed. More
content to be added in down-stream patch.

This patch doesn't change any functionality.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:33:59 +03:00
Or Gerlitz 3bc4b7bfa0 net/mlx5e: Add NIC attributes for offloaded TC flows
Add structure that contains the attributes related to offloaded
NIC flows. Currently it has the actions and flow tag.

While here, do xmas tree cleanup of the TC configure function.

This patch doesn't change any functionality.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:33:58 +03:00
Or Gerlitz ecf5bb796b net/mlx5e: Add prefix for e-switch offloaded TC flow attributes
Add esw_ prefix to the flow attributes attached to offloaded e-switch
TC flows. This is a pre-step to add attributes to offloaded NIC TC flows.

Also, save one pointer space by using gcc's zero size array, this would
be beneficial for environments where 100Ks (or Ms) of flows are offloaded.

This patch doesn't change any functionality.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28 15:33:57 +03:00
David S. Miller cc628c9680 mlx5e-failsafe 27-03-2017
This series provides a fail-safe mechanism to allow safely re-configuring
 mlx5e netdevice and provides a resiliency against sporadic
 configuration failures.
 
 To enable this we do some refactoring and code reorganizing to allow
 breaking the drivers open/close flows to stages:
       open -> activate -> deactivate -> close.
 
 In addition we need to allow creating fresh HW ring resources
 (mlx5e_channels) with their own "new" set of parameters, while keeping
 the current ones running and active until the new channels are
 successfully created with the new configuration, and only then we can
 safly replace (switch) old channels with new ones.
 
 For that we introduce mlx5e_channels object and an API to manage it:
  - channels = open_channels(new_params):
    open fresh TX/RX channels
  - activate_channels(channels):
    redirect traffic to them and attach them to the netdev
  - deactivate_channes(channels)
    stop traffic and detach from netdev
  - close(channels)
    Free the TX/RX HW resources of those channels
 
 With the above strategy it is straightforward to achieve the desired
 behavior of fail-safe configuration.  In pseudo code:
 
 make_new_config(new_params)
 {
 	old_channels = current_active_channels;
 	new_channels = create_channels(new_params);
 	if (!new_channels)
 		return "Failed, but current channels are still active :)"
 
 	deactivate_channels(old_channels); /* Can't fail */
 	set_hw_new_state();                /* If needed  */
 	activate_channels(new_channels);   /* Can't fail */
 	close_channels(old_channels);
 	current_active_channels = new_channels;
 
         return "SUCCESS";
 }
 
 At the top of this series, we change the following flows to be fail-safe:
 ethtool:
    - ring parameters
    - coalesce parameters
    - tx copy break parameters
    - cqe compressing/moderation mode setting (priv flags)
 ndos:
    - tc setup
    - set features: LRO
    - change mtu
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJY2XfKAAoJEEg/ir3gV/o+6fAIAKBsqf+EYhbHA0JoTnV1sm3G
 PSGjj5VCMNTZPyDlTWLEpY2S5TIDRPvICC04i5jWFjo5SOmsRMR6ZV0llHukKC4k
 SAkAYU4A78Ds7UhmWzokebwzWa8VA48eqLRxXV60EAhJ0BOgzZnG09KIpzdplE7A
 pco+F/c/qzJa0NP1KQBBrYIcXbGMrCFcYM8d6lJ8TRfVDdZZpeTB/wvxRixKfe1L
 Ji6+k5tbDynDD3+HWkWq+chAkw4yldN7q8fC8FaN2r0mtWYsYbVSPuP+BlL0XN4R
 oluZEJjnyaCePaqUMW+ZYVb1hCGP7pOoJkBb901XdOnX5M2fU9vK3VufWErYF/s=
 =r6Qw
 -----END PGP SIGNATURE-----

Merge tag 'mlx5e-failsafe' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-failsafe 27-03-2017

This series provides a fail-safe mechanism to allow safely re-configuring
mlx5e netdevice and provides a resiliency against sporadic
configuration failures.

To enable this we do some refactoring and code reorganizing to allow
breaking the drivers open/close flows to stages:
      open -> activate -> deactivate -> close.

In addition we need to allow creating fresh HW ring resources
(mlx5e_channels) with their own "new" set of parameters, while keeping
the current ones running and active until the new channels are
successfully created with the new configuration, and only then we can
safly replace (switch) old channels with new ones.

For that we introduce mlx5e_channels object and an API to manage it:
 - channels = open_channels(new_params):
   open fresh TX/RX channels
 - activate_channels(channels):
   redirect traffic to them and attach them to the netdev
 - deactivate_channes(channels)
   stop traffic and detach from netdev
 - close(channels)
   Free the TX/RX HW resources of those channels

With the above strategy it is straightforward to achieve the desired
behavior of fail-safe configuration.  In pseudo code:

make_new_config(new_params)
{
	old_channels = current_active_channels;
	new_channels = create_channels(new_params);
	if (!new_channels)
		return "Failed, but current channels are still active :)"

	deactivate_channels(old_channels); /* Can't fail */
	set_hw_new_state();                /* If needed  */
	activate_channels(new_channels);   /* Can't fail */
	close_channels(old_channels);
	current_active_channels = new_channels;

        return "SUCCESS";
}

At the top of this series, we change the following flows to be fail-safe:
ethtool:
   - ring parameters
   - coalesce parameters
   - tx copy break parameters
   - cqe compressing/moderation mode setting (priv flags)
ndos:
   - tc setup
   - set features: LRO
   - change mtu
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 21:16:03 -07:00
Mahesh Bandewar e292dcae16 bonding: avoid printing while holding a spinlock
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 21:11:50 -07:00
Mahesh Bandewar b5bf0f5b16 bonding: correctly update link status during mii-commit phase
bond_miimon_commit() marks the link UP after attempting to get the speed
and duplex settings for the link. There is a possibility that
bond_update_speed_duplex() could fail. This is another place where it
could result into an inconsistent bonding link state.

With this patch the link will be marked UP only if the speed and duplex
values retrieved have sane values and processed further.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 21:11:50 -07:00
Mahesh Bandewar c4adfc822b bonding: make speed, duplex setting consistent with link state
bond_update_speed_duplex() retrieves speed and duplex settings. There
is a possibility of failure in retrieving these values but caller has
to assume it's always successful. This leads to having inconsistent
slave link settings. If these (speed, duplex) values cannot be
retrieved, then keeping the link UP causes problems.

The updated bond_update_speed_duplex() returns 0 on success if it
retrieves sane values for speed and duplex. On failure it returns 1
and marks the link down.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 21:11:50 -07:00
Mahesh Bandewar de77ecd4ef bonding: improve link-status update in mii-monitoring
The primary issue is that mii-inspect phase updates link-state and
expects changes to be committed during the mii-commit phase. After
the inspect phase if it fails to acquire rtnl-mutex, the commit
phase (bond_mii_commit) doesn't get to run. This partially updated
state stays and makes the internal-state inconsistent.

e.g. setup bond0 => slaves: eth1, eth2
eth1 goes DOWN -> UP
   mii_monitor()
	mii-inspect()
	    bond_set_slave_link_state(eth1, UP, DontNotify)
	rtnl_trylock() <- fails!

Next mii-monitor round
eth1: No change
   mii_monitor()
	mii-inspect()
	    eth1->link == current-status (ethtool_ops->get_link)
	    no-change-detected

End result:
    eth1:
      Link = BOND_LINK_UP
      Speed = 0xfffff  [SpeedUnknown]
      Duplex = 0xff    [DuplexUnknown]

This doesn't always happen but for some unlucky machines in a large set
of machines it creates problems.

The fix for this is to avoid making changes during inspect phase and
postpone them until acquiring the rtnl-mutex / invoking commit phase.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 21:11:49 -07:00
Jacob Keller 7be147dc14 i40e: initialize params before notifying of l2_param_changes
Probably due to some mis-merging fix a bug associated with commits
d7ce6422d6 ("i40e: don't check params until after checking for client
instance", 2017-02-09) and 3140aa9a78c9 ("i40e: KISS the client
interface", 2017-03-14)

The first commit tried to move the initialization of the params
structure so that we didn't bother doing this if we didn't have a client
interface. You can already see that it looks fishy because of the
indentation. The second commit refactors a bunch of the interface, and
incorrectly drops the params initialization.

I believe what occurred is that internally the two patches were
re-ordered, and the merge conflicts as a result were performed
incorrectly.

Fix the use of an uninitialized variable by correctly initializing the
params variable via i40e_client_get_params().

Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:47:44 -07:00
Colin Ian King 703ba88548 i40evf: dereference VSI after VSI has been null checked
VSI is being dereferenced before the VSI null check; if VSI is
null we end up with a null pointer dereference.  Fix this by
performing VSI deference after the VSI null check.  Also remove
the need for using adapter by using vsi->back->cinst.

Detected by CoverityScan, CID#1419696, CID#1419697
("Dereference before null check")

Fixes: ed0e894de7 ("i40evf: add client interface")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:47:44 -07:00
Alexander Duyck c76cb6ed54 i40e: Drop FCoE code that always evaluates to false or 0
Since FCoE isn't supported by the i40e products there isn't much point in
carrying around code that will always evaluate to false. This patch goes
through and strips out the code in several spots so that we don't go around
carrying variables and/or code that is always going to evaluate to false or
0.

Change-ID: I39d1d779c66c638b75525839db2b6208fdc809d7
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:47:44 -07:00
Alexander Duyck 9eed69a914 i40e: Drop FCoE code from core driver files
Looking over the code for FCoE it looks like the Rx path has been broken at
least since the last major Rx refactor almost a year ago.  It seems like
FCoE isn't supported for any of the Fortville/Fortpark hardware so there
isn't much point in carrying the code around, especially if it is broken
and untested.

Change-ID: I892de8fa551cb129ce2361e738ff82ce55fa229e
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:47:43 -07:00
Alexander Duyck a5b268e4b1 i40e/i40evf: Clean-up process_skb_fields
This is a minor clean-up to make the i40e/i40evf process_skb_fields
function look a little more like what we have in igb.  The Rx checksum
function called out a need for skb->protocol but I can't see where it
actually needs it.  I am assuming this is something that was likely
refactored out some time ago as the Rx checksum code has gone through a few
rewrites.

Change-ID: I0b4668a34d90b61b66ded7c7c26e19a3e2d06251
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:47:43 -07:00
Bimmy Pujari 0a25b7311d i40e: removed no longer needed delays
Removed no longer needed delays.  At preproduction stage those delays were
needed but now these delays are not needed.

Signed-off-by: Bimmy Pujari <bimmy.pujari@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:47:43 -07:00
Robert Konklewski beff3e9d80 i40e: Fixed race conditions in VF reset
First, this patch eliminates IOMMU DMAR Faults caused by VF hardware.
This is done by enabling VF hardware only after VSI resources are
freed. Otherwise, hardware could DMA into memory that is (or just has
been) being freed.

Then, the VF driver is activated only after VSI resources have been
reallocated. That's because the VF driver can request resources
immediately after it's activated. So they need to be ready at that
point.

The second race condition happens when the OS initiates a VF reset,
and then before it's finished modifies VF's settings by changing its
MAC, VLAN ID, bandwidth allocation, anti-spoof checking, etc. These
functions needed to be blocked while VF is undergoing reset. Otherwise,
they could operate on data structures that had just been freed or not
yet fully initialized.

Change-ID: I43ba5a7ae2c9a1cce3911611ffc4598ae33ae3ff
Signed-off-by: Robert Konklewski <robertx.konklewski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:45:14 -07:00
Alexander Duyck 741b8b832a i40e/i40evf: Fix use after free in Rx cleanup path
We need to reset skb back to NULL when we have freed it in the Rx cleanup
path.  I found one spot where this wasn't occurring so this patch fixes it.

Change-ID: Iaca68934200732cd4a63eb0bd83b539c95f8c4dd
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:45:14 -07:00
Harshitha Ramamurthy f25571b576 i40e: fix configuration of RSS table with DCB
There exists a bug in the driver where the calculation of the
RSS size was not taking into account the number of traffic classes
enabled. This patch factors in the traffic classes both in
the initial configuration of the table as well as reconfiguration.

Change-ID: I34dcd345ce52faf1d6b9614bea28d450cfd5f621
Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:45:14 -07:00
Alexander Duyck 1793668c3b i40e/i40evf: Update code to better handle incrementing page count
Update the driver code so that we do bulk updates of the page reference
count instead of just incrementing it by one reference at a time.  The
advantage to doing this is that we cut down on atomic operations and
this in turn should give us a slight improvement in cycles per packet.
In addition if we eventually move this over to using build_skb the gains
will be more noticeable.

I also found and fixed a store forwarding stall from where we were
assigning "*new_buff = *old_buff".  By breaking it up into individual
copies we can avoid this and as a result the performance is slightly
improved.

Change-ID: I1d3880dece4133eca3c32423b04a5467321ccc52
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:45:13 -07:00
Tobias Klauser 656455bf19 net: ibmvnic: Remove unused net_stats member from struct ibmvnic_adapter
The ibmvnic driver keeps its statistics in net_device->stats, so the
net_stats member in struct ibmvnic_adapter is unused. Remove it.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 16:02:00 -07:00
Tobias Klauser 8c2ef1978f net: ibmveth: Remove unused stats member from struct ibmveth_adapter
The ibmveth driver keeps its statistics in net_device->stats, so the
stats member in struct ibmveth_adapter is unused. Remove it.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 16:02:00 -07:00
Tobias Klauser 8570bcd83d net: bfin_mac: Remove unused stats member from struct bfin_mac_local
The bfin_mac driver keeps its statistics in net_device->stats, so the
stats member in struct bfin_mac_local is unused. Remove it, as well as
the accompanying comment.

Cc: adi-buildroot-devel@lists.sourceforge.net
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 16:01:59 -07:00
Colin Ian King eb996edb03 netvsc: fix dereference before null check errors
ndev is being checked to see if it is a null pointer however before
the null check ndev is being dereferenced; hence there is a potential
null pointer dereference bug that needs fixing. Fix this by only
dereferencing ndev after the null check.

Detected by CoverityScan, CID#1420760, CID#140761 ("Dereference
before null check")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 16:00:58 -07:00
Philippe Reynes 86573f6152 net: tehuti: use new api ethtool_{get|set}_link_ksettings
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone may test this patch.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 16:00:07 -07:00
Philippe Reynes bb37056807 net: cris: eth_v10: use new api ethtool_{get|set}_link_ksettings
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone may test this patch.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 15:53:19 -07:00
Saeed Mahameed 2e20a15120 net/mlx5e: Fail safe mtu and lro setting
Use the new fail-safe channels switch mechanism to set new
netdev mtu and lro settings.

MTU and lro settings demand some HW configuration changes after new
channels are created and ready for action. In order to unify switch
channels routine for LRO and MTU changes, and maybe future configuration
features, we now pass to it a modify HW function pointer to be
invoked directly after old channels are de-activated and before new
channels are activated.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:24 +03:00
Saeed Mahameed 6f9485af40 net/mlx5e: Fail safe tc setup
Use the new fail-safe channels switch mechanism to set up new
tc parameters.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:23 +03:00
Saeed Mahameed be7e87f92b net/mlx5e: Fail safe cqe compressing/moderation mode setting
Use the new fail-safe channels switch mechanism to set new
CQE compressing and CQE moderation mode settings.

We also move RX CQE compression modify function out of en_rx file  to
a more appropriate place.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:22 +03:00
Saeed Mahameed 546f18ed3f net/mlx5e: Fail safe ethtool settings
Use the new fail-safe channels switch mechanism to set new ethtool
settings:
 - ring parameters
 - coalesce parameters
 - tx copy break parameters

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:21 +03:00
Saeed Mahameed 55c2503dae net/mlx5e: Introduce switch channels
A fail safe helper functions that allows switching to new channels on the
fly,  In simple words:

make_new_config(new_params)
{
    new_channels = open_channels(new_params);
    if (!new_channels)
         return "Failed, but current channels are still active :)"

    switch_channels(new_channels);

    return "SUCCESS";
}

Demonstrate mlx5e_switch_priv_channels usage in set channels ethtool
callback and make it fail-safe using the new switch channels mechanism.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:20 +03:00
Saeed Mahameed 9008ae0748 net/mlx5e: Minimize mlx5e_{open/close}_locked
mlx5e_redirect_rqts_to_{channels,drop} and mlx5e_{add,del}_sqs_fwd_rules
and Set real num tx/rx queues belong to
mlx5e_{activate,deactivate}_priv_channels, for that we move those functions
and minimize mlx5e_open/close flows.

This will be needed in downstream patches to replace old channels with new
ones without the need to call mlx5e_close/open.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:19 +03:00
Saeed Mahameed a43b25daef net/mlx5e: CQ and RQ don't need priv pointer
Remove mlx5e_priv pointer from CQ and RQ structs,
it was needed only to access mdev pointer from priv pointer.

Instead we now pass mdev where needed.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:18 +03:00
Saeed Mahameed 6a9764efb2 net/mlx5e: Isolate open_channels from priv->params
In order to have a clean separation between channels resources creation
flows and current active mlx5e netdev parameters, make sure each
resource creation function do not access priv->params, and only works
with on a new fresh set of parameters.

For this we add "new" mlx5e_params field to mlx5e_channels structure
and use it down the road to mlx5e_open_{cq,rq,sq} and so on.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:18 +03:00
Saeed Mahameed acc6c5953a net/mlx5e: Split open/close channels to stages
As a foundation for safe config flow, a simple clear API such as
(Open then Activate) where the "Open" handles the heavy unsafe
creation operation and the "activate" will be fast and fail safe,
to enable the newly created channels.

For this we split the RQs/TXQ SQs and channels open/close flows to
open => activate, deactivate => close.

This will simplify the ability to have fail safe configuration changes
in downstream patches as follows:

make_new_config(new_params)
{
     old_channels = current_active_channels;
     new_channels = create_channels(new_params);
     if (!new_channels)
              return "Failed, but current channels still active :)"
     deactivate_channels(old_channels); /* Can't fail */
     activate_channels(new_channels); /* Can't fail */
     close_channels(old_channels);
     current_active_channels = new_channels;

     return "SUCCESS";
}

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:17 +03:00
Saeed Mahameed b676f65389 net/mlx5e: Refactor refresh TIRs
Rename mlx5e_refresh_tirs_self_loopback to mlx5e_refresh_tirs,
as it will be used in downstream (Safe config flow) patches, and make it
fail safe on mlx5e_open.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:16 +03:00
Saeed Mahameed a5f97fee74 net/mlx5e: Redirect RQT refactoring
RQ Tables are always created once (on netdev creation) pointing to drop RQ
and at that stage, RQ tables (indirection tables) are always directed to
drop RQ.

We don't need to use mlx5e_fill_{direct,indir}_rqt_rqns to fill the drop
RQ in create RQT procedure.

Instead of having separate flows to redirect direct and indirect RQ Tables
to the current active channels Receive Queues (RQs), we unify the two
flows by introducing mlx5e_redirect_rqt function and redirect_rqt_param
struct. Combined, they provide one generic logic to fill the RQ table RQ
numbers regardless of the RQ table purpose (direct/indirect).

Demonstrated the usage with mlx5e_redirect_rqts_to_channels which will
be called on mlx5e_open and with mlx5e_redirect_rqts_to_drop which will
be called on mlx5e_close.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:15 +03:00
Saeed Mahameed ff9c852f91 net/mlx5e: Introduce mlx5e_channels
Have a dedicated "channels" handler that will serve as channels
(RQs/SQs/etc..) holder to help with separating channels/parameters
operations, for the downstream fail-safe configuration flow, where we will
create a new instance of mlx5e_channels with the new requested parameters
and switch to the new channels on the fly.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:14 +03:00
Saeed Mahameed be4891af77 net/mlx5e: Set netdev->rx_cpu_rmap on netdev creation
To simplify mlx5e_open_locked flow we set netdev->rx_cpu_rmap on netdev
creation rather on netdev open, it is redundant to set it every time on
mlx5e_open_locked.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:13 +03:00
Saeed Mahameed 7f859ecfa8 net/mlx5e: Set SQ max rate on mlx5e_open_txqsq rather on open_channel
Instead of iterating over the channel SQs to set their max rate, do it
on SQ creation per TXQ SQ.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:12 +03:00
K. Y. Srinivasan 386f57622c netvsc: Properly initialize the return value
Initialize the return value correctly.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-25 20:15:56 -07:00
K. Y. Srinivasan b1dd90cea7 netvsc: Fix a bug in sub-channel handling
All netvsc channels are handled via NAPI. Setup the "read mode" correctly
for the netvsc sub-channels.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-25 20:15:56 -07:00
Jonas Bonn 91ed81f9ab gtp: support SGSN-side tunnels
The GTP-tunnel driver is explicitly GGSN-side as it searches for PDP
contexts based on the incoming packets _destination_ address.  If we
want to place ourselves on the SGSN side of the  tunnel, then we want
to be identifying PDP contexts based on _source_ address.

Let it be noted that in a "real" configuration this module would never
be used:  the SGSN normally does not see IP packets as input.  The
justification for this functionality is for PGW load-testing applications
where the input to the SGSN is locally generally IP traffic.

This patch adds a "role" argument at GTP-link creation time to specify
whether we are on the GGSN or SGSN side of the tunnel; this flag is then
used to determine which part of the IP packet to use in determining
the PDP context.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Harald Welte <laforge@gnumonks.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-25 20:11:19 -07:00
Jonas Bonn ae6336b57e gtp: rename SGSN netlink attribute
This is a mostly cosmetic rename of the SGSN netlink attribute to
the GTP link.  The justification for this is that we will be making
the module support decapsulation of "downstream" SGSN packets, in
which case the netlink parameter actually refers to the upstream GGSN
peer.  Renaming the parameter makes the relationship clearer.

The legacy name is maintained as a define in the header file in order
to not break existing code.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Harald Welte <laforge@gnumonks.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-25 20:11:19 -07:00
Daniele Palmas c6adf77953 net: usb: qmi_wwan: add qmap mux protocol support
This patch adds support for qmap mux protocol available in recent
Qualcomm based modems.

The qmap mux protocol can be used for multiplexing data packets in
order to have multiple ip streams through the same physical device.

Two new sysfs files are added for adding/removing the qmap mux based
interfaces (named qmimux):

- /sys/class/net/<iface>/qmi/add_mux
- /sys/class/net/<iface>/qmi/del_mux

Main patch author is Bjørn Mork <bjorn@mork.no>

Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-25 20:03:35 -07:00
Arkadi Sharshevsky 1312444374 mlxsw: spectrum_kvdl: Cosmetic kvdl allocator API change
Currently the return allocated index and err value are multiplexed.
This patch changes the API to decouple the ret value from the allocated
index.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-25 19:56:15 -07:00
Saeed Mahameed 3139104861 net/mlx5e: Different SQ types
Different SQ types (tx, xdp, ico) are growing apart, we separate them
and remove unwanted parts in each one of them, to simplify data path and
utilize data cache.

Remove DB union from SQ structures since it is not needed anymore as we
now have different SQ data type for each SQ.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:46 -07:00
Saeed Mahameed 33ad971186 net/mlx5e: Generalize SQ create/modify/destroy functions
In the next patches we will introduce different SQ types,
and we would want to reuse those functions, in this patch we make them
agnostic to SQ type (txq, xdp, ico).

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:46 -07:00
Saeed Mahameed 3b77235b94 net/mlx5e: Proper names for SQ/RQ/CQ functions
Rename mlx5e_{create,destroy}_{sq,rq,cq} to
mlx5e_{alloc,free}_{sq,rq,cq}.

Rename mlx5e_{enable,disable}_{sq,rq,cq} to
mlx5e_{create,destroy}_{sq,rq,cq}.

mlx5e_{enable,disable}_{sq,rq,cq} used to actually create/destroy the SQ
in FW, so we rename them to align the functions names with FW semantics.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:46 -07:00
Saeed Mahameed 864b2d7153 net/mlx5e: Generalize tx helper functions for different SQ types
In the next patches we will introduce different SQ types, for that we here
generalize some TX helper functions to work with more basic SQ parameters,
in order to re-use them for the different SQ types.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:46 -07:00
Saeed Mahameed 2239185ccd net/mlx5e: Optimize XDP frame xmit
XDP SQ has a fixed size WQE (MLX5E_XDP_TX_WQEBBS = 1) and only posts
one kind of WQE (MLX5_OPCODE_SEND),

Also we initialize SQ descriptors static fields once on open_xdpsq,
rather than every time on critical path.

Optimize the code in light of those facts and add a prefetch of the TX
descriptor first thing in the xdp xmit function.

Performance improvement:
System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz

Test case              Before     Now        improvement
---------------------------------------------------------------
XDP TX   (1 core)      13Mpps    13.7Mpps       5%

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:46 -07:00
Saeed Mahameed 39e12351a3 net/mlx5e: Poll XDP TX CQ before RX CQ
Handle XDP TX completions before handling RX packets, to make sure more
free space is available for XDP TX packets a moment before handling
RX packets.

Performance improvement:
System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz

Test case              Before     Now      improvement
---------------------------------------------------------------
XDP Drop (1 core)      16.9Mpps  16.9Mpps    No change
XDP TX   (1 core)      12Mpps    13Mpps      8%

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:46 -07:00
Saeed Mahameed 31871f87bb net/mlx5e: Move XDP SQ instance into RQ
To save many rq->channel->sq dereferences in fast-path.
And rename it to xdpsq.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:45 -07:00
Saeed Mahameed eba2db2bd2 net/mlx5e: Move mlx5e_rq struct declaration
Move struct mlx5e_rq and friends to appear after mlx5e_sq declaration in
en.h.

We will need this for next patch to move the mlx5e_sq instance into
mlx5e_rq struct for XDP SQs.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:45 -07:00
Saeed Mahameed 1c4bf94045 net/mlx5e: Move XDP completion functions to rx file
XDP code belongs to RX path, move mlx5e_poll_xdp_tx_cq and
mlx5e_free_xdp_tx_descs to en_rx.c.

Rename them to mlx5e_poll_xdpsq_cq and mlx5e_free_xdpsq_descs.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:45 -07:00
Saeed Mahameed aff2615763 net/mlx5e: Single bfreg (UAR) for all mlx5e SQs and netdevs
One is sufficient since Blue Flame is not supported anymore.
This will also come in handy for switchdev mode to save resources, since
VF representors will use same single UAR as well for their own SQs.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:45 -07:00
Saeed Mahameed 6982ab6097 net/mlx5e: Xmit, no write combining
mlx5e netdev Blue Flame (write combining) support demands a lot of
overhead for a little latency gain for some special cases, this overhead
is hurting the common case.

Here we remove xmit Blue Flame support by creating all bfregs with no
write combining for all SQs, and we remove a lot of BF logic and
conditions from xmit data path.

Simplify mlx5e_tx_notify_hw (doorbell function) by removing BF related
code and by removing one memory barrier needed for WC mapped SQ doorbell
buffers, which no longer exist.

Performance improvement:
System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz

Test case                   Before      Now      improvement
---------------------------------------------------------------
TX packets (24 threads)     50Mpps      54Mpps    8%

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:45 -07:00
Saeed Mahameed 80fe326ab8 net/mlx5e: Use dma_rmb rather than rmb in CQE fetch routine
Use dma_rmb in mlx5e_get_cqe rather than aggressive rmb (at least on
some architectures), this should help improve the performance on such
CPU archs where dma_rmb is optimized.

Performance improvement:
System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz

Test case                   Baseline      Now      improvement
---------------------------------------------------------------
TX packets (24 threads)     45Mpps        50Mpps      11%
TC stack Drop (1 core)      3.45Mpps      3.6Mpps     5%
XDP Drop      (1 core)      14Mpps        16.9Mpps    20%
XDP TX        (1 core)      10.4Mpps      12Mpps      15%

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:44 -07:00
Florian Fainelli 68e498554f net: dsa: bcm_sf2: Add missing OF_MDIO dependency
bcm_sf2 does require the MDIO_BCM_UNIMAC driver which is now dependent
on OF_MDIO but also internally uses of_mdio.c provided routines which
are guarted with OF_MDIO.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 90eff9096c ("net: phy: Allow splitting MDIO bus/device support from PHYs")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 15:03:06 -07:00
Ido Schimmel 18281f2dab mlxsw: spectrum: Query cell size from firmware
As explained in the previous patch, the cell size may change in future
devices, so query it from the firmware instead of hard coding it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 13:53:29 -07:00
Ido Schimmel f417f04da5 mlxsw: spectrum: Refactor port buffer configuration
The sizes and thresholds of the priority group (PG) buffers are
configured in cells, which represent a specific amount of bytes.

The cell size can vary in different devices, so it's better to query it
from the firmware than hard coding it.

Refactor the code dealing with this value into different functions, so
that it will be easier to make the conversion in the next patch.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 13:53:29 -07:00
Ido Schimmel d3daae1b08 mlxsw: spectrum_buffers: Query shared buffer size from firmware
Instead of hard coding the size of the shared buffer in the driver,
query it from the firmware, as it may change in future devices.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 13:53:28 -07:00
Ido Schimmel 5ec2ee7dd2 mlxsw: Query maximum number of ports from firmware
We currently hard code the maximum number of ports in the driver, but
this may change in future devices, so query it from the firmware
instead.

Fallback to a maximum of 64 ports in case this number can't be queried.
This should only happen in SwitchX-2 for which this number is correct.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 13:53:28 -07:00
Ido Schimmel 8494ab06e0 mlxsw: spectrum_router: Query number of LPM trees from firmware
Instead of hard coding the number of LPM trees in the driver, query it
from the firmware, as it may change in future devices.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 13:53:28 -07:00
David S. Miller ba82427d4a Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2017-03-23

This series contains updates to i40e and i40e.txt documentation.

Jake provides all the changes in the series which are centered around
ntuple filter fixes and additional support.  Fixed the current
implementation of .set_rxnfc, where we were not reading the mask field
for filter entries which was resulting in filters not behaving as
expected and not working correctly.  When cleaning up after disabling
flow director support, ensure that the default input set is correctly
reprogrammed.  Since the hardware only supports a single input set for
all flows of that type, the driver shall only allow the input set to
change if there are no other configured filters for that flow type, so
add support to detect when we can update the input set for each flow
type.  Align the driver to other drivers to partition the ring_cookie
value into 8bits of VF index, along with 32bits of queue number instead
of using the user-def field.  Added support to parse the user-def field
into a data structure format to allow future extensions of the user-def
filed by keeping all the code that read/writes the field into a single
location.  Added support for flexible payloads passed via ethtool
user-def field.  We support a single flexible word (2byte) value per
protocol type, and we handle the FLX_PIT register using a list of
flexible entries so that each flow type may be configured separately.
Enabled flow director filters for SCTPv4 packets using the ethtool
ntuple interface to enable filters.  Updated the documentation on the
i40e driver to include the newly added support to ntuple filters.
Reduced complexity of a if-continue-else-break section of code by
taking advantage of using hlist_for_each_entry_continue() instead.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 13:45:07 -07:00
Felix Manlunas 7cc61db9c7 liquidio: do not reset Octeon if NIC firmware was preloaded
The PF driver is incorrectly resetting Octeon when the module parameter
"fw_type=none" is there.  "fw_type=none" means the PF should not load any
firmware to the NIC because Octeon is already running preloaded firmware.

Fix it by putting an if (fw_type != none) around the reset code.

Because the Octeon reset is now conditionally gone, when unloading the
driver, conditionally send the RESET_PF command to the firmware who will
then free up PF-related data structures.

Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 13:20:43 -07:00
Florian Fainelli e9d7af78b2 net: systemport: Simplify circular pointer arithmetic
Similar to c298ede2fe ("net: bcmgenet: simplify circular pointer
arithmetic") we don't need to complex arthimetic since we always have a
ring size that is a power of 2.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:53:15 -07:00
Florian Fainelli 6baa785a9c net: systemport: Clear status to reduce spurious interrupts
Do something similar to commit d5810ca325 ("net: bcmgenet: clear
status to reduce spurious interrupts") and clear interrupts right before
servicing them. This reduces the number of interrupts by 10K
interrupts/sec for a TX TCP session 1Gbits/sec.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:53:14 -07:00
Florian Fainelli 30defeb2fb net: systemport: Track per TX ring statistics
bcm_sysport_tx_reclaim_one() is currently summing TX bytes/packets in a
way that is not SMP friendly, mutliples CPUs could run
bcm_sysport_tx_reclaim_one() independently and still update
stats->tx_bytes and stats->tx_packets, cloberring the other CPUs
statistics.

Fix this by tracking per TX rings the number of bytes, packets,
dropped and errors statistics, and provide a bcm_sysport_get_nstats()
function which aggregates everything and returns a consistent output.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:53:14 -07:00
Florian Fainelli 90eff9096c net: phy: Allow splitting MDIO bus/device support from PHYs
Introduce a new configuration symbol: MDIO_DEVICE which allows building
the MDIO devices and bus code, without pulling in the entire Ethernet
PHY library and devices code.

PHYLIB nows select MDIO_DEVICE and the relevant Makefile files are
updated to reflect that.

When MDIO_DEVICE (MDIO bus/device only) is selected, but not PHYLIB, we
have mdio-bus.ko as a loadable module, and it does not have a
module_exit() function because the safety of removing a bus class is
unclear.

When both MDIO_DEVICE and PHYLIB are enabled, we need to assemble
everything into a common loadable module: libphy.ko because of nasty
circular dependencies between phy.c, phy_device.c and mdio_bus.c which
are really tough to untangle.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:51:05 -07:00
Florian Fainelli 17487eebaf net: phy: MDIO_BCM_UNIMAC should depend on OF_MDIO
The Broadcom MDIO UniMAC driver uses routines provided by of_mdio.c which is
guarded by CONFIG_OF_MDIO.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:51:04 -07:00
LABBE Corentin 270c7759fb net: stmmac: add set_mac to the stmmac_ops
Two different set_mac functions exists but stmmac_dwmac4_set_mac() is
only used for enabling and never for disabling.
So on dwmac4, the MAC RX/TX is never disabled.

This patch add a generic function pointer set_mac() to stmmac_ops and
replace all call to stmmac_set_mac/stmmac_dwmac4_set_mac by a call to
this pointer.

Since dwmac4_ops is const, set_mac cannot be modified after, and so dwmac4_ops
is duplioacted like dwmac4_dma_ops.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 12:36:42 -07:00
Ido Schimmel 9a32562bec mlxsw: Remove debugfs interface
We don't use it during development and we can't extend it either, so
remove it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 21:29:32 -07:00
Jacob Keller 584a88709b i40e: make use of hlist_for_each_entry_continue
Replace a complex if->continue->else->break construction in
i40e_next_filter. We can simply use hlist_for_each_entry_continue
instead. This drops a lot of confusing code. The resulting code is much
easier to understand the intention, and follows the more normal pattern
for using hlist loops. We could have also used a break with a "return
next" at the end of the function, instead of return NULL, but the
current implementation is explicitly clear that when you reach the end
of the loop you get a NULL value. The alternative construction is less
clear since the reader would have to know that next is NULL at the end
of the loop.

Change-Id: Ife74ca451dd79d7f0d93c672bd42092d324d4a03
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller f223c8752a i40e: add support for SCTPv4 FDir filters
Enable FDir filters for SCTPv4 packets using the ethtool ntuple
interface to enable filters. The ethtool API does not allow masking on
the verification tag.

Change-Id: I093e88a8143994c7e6f4b7b17a0bd5cf861d18e4
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller 0e588de17f i40e: implement support for flexible word payload
Add support for flexible payloads passed via ethtool user-def field.
This support is somewhat limited due to hardware design. The input set
can only be programmed once per filter type, and the flexible offset is
part of this filter input set. This means that the user cannot program
both a regular and a flexible filter at the same time for a given flow
type. Additionally, the user may not program two flexible filters of the
same flow type with different offsets, although they are allowed to
configure different values at that offset location.

We support a single flexible word (2byte) value per protocol type, and
we handle the FLX_PIT register using a list of flexible entries so that
each flow type may be configured separately.

Due to hardware implementation, the flexible data is offset from the
start of the packet payload, and thus may not be in part of the header
data. For this reason, the offset provided by the user defined data is
interpreted as a byte offset from the start of the matching payload.
Previous implementations have tried to represent the offset as from the
start of the frame, but this is not feasible because header sizes may
change due to options.

Change-Id: 36ed27995e97de63f9aea5ade5778ff038d6f811
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller e793095e8a i40e: add parsing of flexible filter fields from userdef
Add code to parse the user-def field into a data structure format. This
code is intended to allow future extensions of the user-def field by
keeping all code that actually reads and writes the field into a single
location. This ensures that we do not litter the driver with references
to the user-def field and minimizes the amount of bitwise operations we
need to do on the data.

Add code which parses the lower 32bits into a flexible word and its
offset. This will be used in a future patch to enable flexible filters
which can match on some arbitrary data in the packet payload. For now,
we just return -EOPNOTSUPP when this is used.

Add code to fill in the user-def field when reporting the filter back,
even though we don't actually implement any user-def fields yet.

Additionally, ensure that we mask the extended FLOW_EXT bit from the
flow_type now that we will be accepting filters which have the FLOW_EXT
bit set (and thus make use of the user-def field).

Change-Id: I238845035c179380a347baa8db8223304f5f6dd7
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller 43b15697a3 i40e: partition the ring_cookie to get VF index
Do not use the user-def field for determining the VF target. Instead,
similar to ixgbe, partition the ring_cookie value into 8bits of VF
index, along with 32bits of queue number. This is better than using the
user-def field, because it leaves the field open for extension in
a future patch which will enable flexible data. Also, this matches with
convention used by ixgbe and other drivers.

Change-Id: Ie36745186d817216b12f0313b99ec95cb8a9130c
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller 9229e99334 i40e: allow changing input set for ntuple filters
Add support to detect when we can update the input set for each flow
type.

Because the hardware only supports a single input set for all flows of
that matching type, the driver shall only allow the input set to change
if there are no other configured filters for that flow type.

Thus, the first filter added for each flow type is allowed to change the
input set, and all future filters must match the same input set. Display
a diagnostic message whenever the filter input set changes, and
a warning whenever a filter cannot be accepted because it does not match
the configured input set.

Change-Id: Ic22e1c267ae37518bb036aca4a5694681449f283
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller 3bcee1e653 i40e: restore default input set for each flow type
Ensure that the default input set is correctly reprogrammed when
cleaning up after disabling flow director support. This ensures that the
programmed value will be in a clean state.

Although we do not yet have support for SCTPv4 filters, a future patch
will add support for this protocol, so we will correctly restore the
SCTPv4 input set here as well. Note that strictly speaking the default
hardware value for SCTP includes matching the verification tag. However,
the ethtool API does not have support for specifying this value, so
there is no reason to keep the verification field enabled.

This patch is the next step on the way to enabling partial tuple filters
which will be implemented in a following patch.

Change-Id: Ic22e1c267ae37518bb036aca4a5694681449f283
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller 36777d9fa2 i40e: check current configured input set when adding ntuple filters
Do not assume that hardware has been programmed with the default mask,
but instead read the input set registers to determine what is currently
programmed. This ensures that all programmed filters match exactly how
the hardware will interpret them, avoiding confusion regarding filter
behavior.

This sets the initial ground-work for allowing custom input sets where
some fields are disabled. A future patch will fully implement this
feature.

Instead of using bitwise negation, we'll just explicitly check for the
correct value. The use of htonl and htons are used to silence sparse
warnings. The compiler should be able to handle the constant value and
avoid actually performing a byteswap.

Change-Id: I3d8db46cb28ea0afdaac8c5b31a2bfb90e3a4102
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jacob Keller faa16e0f38 i40e: correctly honor the mask fields for ETHTOOL_SRXCLSRLINS
The current implementation of .set_rxnfc does not properly read the mask
field for filter entries. This results in incorrect driver behavior, as
we do not reject filters which have masks set to ignore some fields. The
current implementation simply assumes that every part of the tuple or
"input set" is specified. This results in filters not behaving as
expected, and not working correctly.

As a first step in supporting some partial filters, add code which
checks the mask fields and rejects any filters which do not have an
acceptable mask. For now, we just assume that all fields must be set.

This will get the driver one step towards allowing some partial filters.
At a minimum, the ethtool commands which previously installed filters
that would not function will now return a non-zero exit code indicating
failure instead.

We should now be meeting the minimum requirements of the .set_rxnfc API,
by ensuring that all filters we program have a valid mask value for each
field.

Finally, add code to report the mask correctly so that the ethtool
command properly reports the mask to the user.

Note that the typecast to (__be16) when checking source and destination
port masks is required because the ~ bitwise negation operator does not
correctly handle variables other than integer size.

Change-Id: Ia020149e07c87aa3fcec7b2283621b887ef0546f
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-23 21:13:33 -07:00
Jie Deng 67ff2c71bb net: dwc-xlgmac: use dual license
The driver "dwc-xlgmac" is dual-licensed.
Declare the dual license with MODULE_LICENSE().

Signed-off-by: Jie Deng <jiedeng@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 17:04:14 -07:00
Jie Deng ea8c1c642e net: dwc-xlgmac: declaration of dual license in headers
The driver "dwc-xlgmac" is dual-licensed. This patch adds
declaration of dual license in file headers.

Signed-off-by: Jie Deng <jiedeng@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 17:04:14 -07:00
David S. Miller 16ae1f2236 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/broadcom/genet/bcmmii.c
	drivers/net/hyperv/netvsc.c
	kernel/bpf/hashtab.c

Almost entirely overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 16:41:27 -07:00
Mintz, Yuval dec26533ae qed: Reserve VF feature before PF
Align the driver feature distribution with the flow utilized
by the management firmware - first reserve L2 queues for
VFs and use all the remaining for the PF.

The current distribution might lead to PFs with an enormous
amount of queues, but at the same time leave us with insufficient
resources for starting all VFs.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 11:53:30 -07:00
Mintz, Yuval 810bb1f0d3 qed: Don't waste SBs unused by RoCE
When RoCE is enabled on a given L2 interface, the interrupt lines
are divided equally between L2 and RoCE -
But in case number of lines needed for RoCE is limited by number
of available CNQs, we can utilize the additional lines for L2.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 11:53:30 -07:00
Mintz, Yuval 3981594409 qed: Reduce verbosity of unimplemented MFW messages
Management firmware and driver are meant to be both backward and forward
compatibile with each other.

If a new mangement firmware would work with an older driver,
it's possible that driver would receive indications which are meaningless
to it. That's perfectly acceptible from the firmware part - so no need to
log such messages at default verbosity; That would only serve to confuse
users.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 11:53:30 -07:00
Mintz, Yuval 1799100207 qed: Correct endian order of MAC passed to MFW
The management firmware is running on a Big Endian processor,
and when running on LE platform HW is configured to swap access
to memory shared between management firmware and driver on
32-bit granulariy.

As a result, for matters of simplicity most of the APIs between
driver and management firmware are based on 32-bit variables.
MAC settings are one exception, as driver needs to fill a byte
array when indicating to management firmware that primary MAC
has changed.
Due to the swap, driver must make sure that the mac that was
provided in byte-order would be translated into native order,
otherwise after the swap the management firmware would read
it swapped.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 11:53:30 -07:00
Tomer Tayar 2f67af8c59 qed: Pass src/dst sizes when interacting with MFW
The driver interaction with management firmware involves a union
of all the data-members relating to the commands the driver prepares.

Current interface assumes the caller always passes such a union -
but thats cumbersome as well as risky [chancing a stack corruption
in case caller accidentally passes a smaller member instead of union].

Change implementation so that caller could pass a pointer to any
of the members instead of the union.

Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 11:53:29 -07:00
Tomer Tayar 4ed1eea82a qed: Revise MFW command locking
Interaction of driver -> management firmware is based
on a one-pending mailbox [per interface], and various
mailbox commands need to be synchronized.

Current scheme is messy, and there's a difficulty extending
it as it deals differently with various commands as well as
making assumption on the required behavior for load/unload
requests.

Drop the current scheme into a completion-list-based approach;
Each flow would try sending the command when possible,
allowing one flow to complete another flow's completion and
relieve the mailbox before sending its own command.

Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 11:53:29 -07:00
Pavel Belous 68c3865903 net:ethernet:aquantia: Fix for RX checksum offload.
Since AQC-100/107/108 chips supports hardware checksums for RX we should indicate this
via NETIF_F_RXCSUM flag.

v1->v2: 'Signed-off-by' tag added.

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:40:52 -07:00
Lendacky, Thomas f43feef4e6 amd-xgbe: Fix the ECC-related bit position definitions
The ECC bit positions that describe whether the ECC interrupt is for
Tx, Rx or descriptor memory and whether the it is a single correctable
or double detected error were defined in incorrectly (reversed order).
Fix the bit position definitions for these settings so that the proper
ECC handling is performed.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:39:58 -07:00
stephen hemminger ce12b81061 netvsc: fix and cleanup rndis_filter_set_packet_filter
Fix warning from unused set_complete variable. And rearrange code
to eliminate unnecessary goto's.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:38:57 -07:00
stephen hemminger ebc1dcf600 netvsc: eliminate unnecessary skb == NULL checks
Since there already is a special case goto for control messages (skb == NULL)
in netvsc_send, there is no need for later checks in same code path.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:38:57 -07:00
stephen hemminger 00ecfb3b34 netvsc: remove unnecessary lock on shutdown
The channel inbound lock was not being used at all by the netvsc
device, but the spin_lock was helpful by providing necessary
barrier before waiting.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:38:56 -07:00
stephen hemminger 43c7bd1ffc netvsc: use refcount_t for keeping track of sub channels
Rather than a lock and variable, use a refcount_t to keep track
of the number of sub channels.  Don't need to wait for subchannels
on device removal since wait was already done in device_add.

Also fix the error handling; don't wait forever in case of
an error on request to create sub channels.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:38:56 -07:00
stephen hemminger a0be450e19 netvsc: uses RCU instead of removal flag
It is cleaner to use RCU protected pointer (nvdev_ctx->nvdev)
to indicate device is in removed state, rather than having a separate
boolean flag. By using the pointer the context can be checked
by static checkers and dynamic lockdep.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:38:56 -07:00
stephen hemminger 545a8e79bd netvsc: use RCU to protect inner device structure
The netvsc driver has an internal structure (netvsc_device) which
is created when device is opened and released when device is closed.
And also opened/released when MTU or number of channels change.

Since this is referenced in the receive and transmit path, it is
safer to use RCU to protect/prevent use after free problems.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:38:56 -07:00
stephen hemminger 3071ada491 netvsc: change max channel calculation
The default number of maximum channels should be limited to the
number of cpus available on the numa node of the primary channel.
This also makes sure maximum channels <= num_online_cpus

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 19:38:56 -07:00