Commit Graph

23523 Commits

Author SHA1 Message Date
Yangbo Lu ceefc71d4c ptp: rework gianfar_ptp as QorIQ common PTP driver
gianfar_ptp was the PTP clock driver for 1588 timer
module of Freescale QorIQ eTSEC (Enhanced Three-Speed
Ethernet Controllers) platforms. Actually QorIQ DPAA
(Data Path Acceleration Architecture) platforms is
also using the same 1588 timer module in hardware.

This patch is to rework gianfar_ptp as QorIQ common
PTP driver to support both DPAA and eTSEC. Moved
gianfar_ptp.c to drivers/ptp/, renamed it as
ptp_qoriq.c, and renamed many variables. There were
not any function changes.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28 23:05:11 -04:00
Thierry Reding 29555fa3de net: stmmac: Use mutex instead of spinlock
Some drivers, such as DWC EQOS on Tegra, need to perform operations that
can sleep under this lock (clk_set_rate() in tegra_eqos_fix_speed()) for
proper operation. Since there is no need for this lock to be a spinlock,
convert it to a mutex instead.

Fixes: e6ea2d16fc ("net: stmmac: dwc-qos: Add Tegra186 support")
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tested-by: Bhadram Varka <vbhadram@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28 22:54:15 -04:00
Sudarsana Reddy Kalluru 1b40428c04 bnx2x: Collect the device debug information during Tx timeout.
Tx-timeout mostly happens due to some issue in the device. In such cases,
debug dump would be helpful for identifying the cause of the issue.
This patch adds support to spill debug data during the Tx timeout. Here
bnx2x_panic_dump() API is used instead of bnx2x_panic(), since we still
want to allow the Tx-timeout recovery a chance to succeed.

Changes from previous version:
-------------------------------
v2: Fixed a coding error.

Please consider applying this to "net-next".

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28 22:52:52 -04:00
Christoph Hellwig 06e9552f5f x86/pci-dma: remove the experimental forcesac boot option
Limiting the dma mask to avoid PCI (pre-PCIe) DAC cycles while paying
the huge overhead of an IOMMU is rather pointless, and this seriously
gets in the way of dma mapping work.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2018-05-28 12:48:16 +02:00
David S. Miller 5b79c2af66 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Lots of easy overlapping changes in the confict
resolutions here.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-26 19:46:15 -04:00
Eran Ben Elisha 05909babce net/mlx5e: Avoid reset netdev stats on configuration changes
Move all RQ, SQ and channel counters from the channel objects into the
priv structure.  With this change, counters will not be reset upon
channel configuration changes.

Channel's statistics for SQs which are associated with TCs higher than
zero will be presented in ethtool -S, only for SQs which were opened at
least once since the module was loaded (regardless of their open/close
current status).  This is done in order to decrease the total amount of
statistics presented and calculated for the common out of box use (no
QoS).

mlx5e_channel_stats is a compound of CH,RQ,SQs stats in order to
create locality for the NAPI when handling TX and RX of the same
channel.

Align the new statistics struct per ring to avoid several channels
update to the same cache line at the same time.
Packet rate was tested, no degradation sensed.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
CC: Qing Huang <qing.huang@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25 16:14:28 -07:00
Bjorn Helgaas 4695ca9d17 ixgbe: Report PCIe link properties with pcie_print_link_status()
Previously the driver used pcie_get_minimum_link() to warn when the NIC
is in a slot that can't supply as much bandwidth as the NIC could use.

pcie_get_minimum_link() can be misleading because it finds the slowest link
and the narrowest link (which may be different links) without considering
the total bandwidth of each link.  For a path with a 16 GT/s x1 link and a
2.5 GT/s x16 link, it returns 2.5 GT/s x1, which corresponds to 250 MB/s of
bandwidth, not the true available bandwidth of about 1969 MB/s for a
16 GT/s x1 link.

Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself.  This finds
the slowest link in the path to the device by computing the total bandwidth
of each link and compares that with the capabilities of the device.

The dmesg change is:

  - PCI Express bandwidth of %dGT/s available
  - (Speed:%s, Width: x%d, Encoding Loss:%s)
  + %u.%03u Gb/s available PCIe bandwidth (%s x%d link)

or, if the device is capable of better performance than is available in the
current slot:

  - This is not sufficient for optimal performance of this card.
  - For optimal performance, at least %dGT/s of bandwidth is required.
  - A slot with more lanes and/or higher speed is suggested.
  + %u.%03u Gb/s available PCIe bandwidth, limited by %s x%d link at %s (capable of %u.%03u Gb/s with %s x%d link)

Note that the driver previously used dev_warn() to suggest using a
different slot, but pcie_print_link_status() uses dev_info() because if the
platform has no faster slot available, the user can't do anything about the
warning and may not want to be bothered with it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-25 17:29:49 -05:00
Bjorn Helgaas 57d12fc6f7 cxgb4: Report PCIe link properties with pcie_print_link_status()
Previously the driver used pcie_get_minimum_link() to warn when the NIC
is in a slot that can't supply as much bandwidth as the NIC could use.

pcie_get_minimum_link() can be misleading because it finds the slowest link
and the narrowest link (which may be different links) without considering
the total bandwidth of each link.  For a path with a 16 GT/s x1 link and a
2.5 GT/s x16 link, it returns 2.5 GT/s x1, which corresponds to 250 MB/s of
bandwidth, not the true available bandwidth of about 1969 MB/s for a
16 GT/s x1 link.

Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself.  This finds
the slowest link in the path to the device by computing the total bandwidth
of each link and compares that with the capabilities of the device.

The dmesg change is:

  - PCIe link speed is %s, device supports %s
  - PCIe link width is x%d, device supports x%d
  + %u.%03u Gb/s available PCIe bandwidth (%s x%d link)

or, if the device is capable of better performance than is available in the
current slot:

  - A slot with more lanes and/or higher speed is suggested for optimal performance.
  + %u.%03u Gb/s available PCIe bandwidth, limited by %s x%d link at %s (capable of %u.%03u Gb/s with %s x%d link)

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-05-25 17:29:49 -05:00
Bjorn Helgaas af125b754e bnxt_en: Report PCIe link properties with pcie_print_link_status()
Previously the driver used pcie_get_minimum_link() to warn when the NIC
is in a slot that can't supply as much bandwidth as the NIC could use.

pcie_get_minimum_link() can be misleading because it finds the slowest link
and the narrowest link (which may be different links) without considering
the total bandwidth of each link.  For a path with a 16 GT/s x1 link and a
2.5 GT/s x16 link, it returns 2.5 GT/s x1, which corresponds to 250 MB/s of
bandwidth, not the true available bandwidth of about 1969 MB/s for a
16 GT/s x1 link.

Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself.  This finds
the slowest link in the path to the device by computing the total bandwidth
of each link and compares that with the capabilities of the device.

The dmesg change is:

  - PCIe: Speed %s Width x%d
  + %u.%03u Gb/s available PCIe bandwidth (%s x%d link)

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-05-25 17:29:49 -05:00
Bjorn Helgaas cc04a1dd3f bnx2x: Report PCIe link properties with pcie_print_link_status()
Previously the driver used pcie_get_minimum_link() to warn when the NIC
is in a slot that can't supply as much bandwidth as the NIC could use.

pcie_get_minimum_link() can be misleading because it finds the slowest link
and the narrowest link (which may be different links) without considering
the total bandwidth of each link.  For a path with a 16 GT/s x1 link and a
2.5 GT/s x16 link, it returns 2.5 GT/s x1, which corresponds to 250 MB/s of
bandwidth, not the true available bandwidth of about 1969 MB/s for a
16 GT/s x1 link.

Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself.  This finds
the slowest link in the path to the device by computing the total bandwidth
of each link and compares that with the capabilities of the device.

The dmesg change is:

  - %s (%c%d) PCI-E x%d %s found at mem %lx, IRQ %d, node addr %pM
  + %s (%c%d) PCI-E found at mem %lx, IRQ %d, node addr %pM
  + %u.%03u Gb/s available PCIe bandwidth (%s x%d link)

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-05-25 17:29:49 -05:00
Shalom Lagziel 868a01a27d net/mlx5e: Introducing new statistics rwlock
Introduce a new read/write lock that will protect statistics gathering from
netdev channels configuration changes.
e.g. when channels are being replaced (increase/decrease number of rings)
prevent statistic gathering (ndo_get_stats64) to read the statistics of
in-active channels (channels that are being closed).

Plus update channels software statistics on the fly when calling
ndo_get_stats64, and remove it from stats periodic work.

Fixes: 9218b44dcc ("net/mlx5e: Statistics handling refactoring")
Signed-off-by: Shalom Lagziel <shaloml@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25 14:11:01 -07:00
Saeed Mahameed 6ab75516cf net/mlx5e: Move phy link down events counter out of SW stats
PHY link down events counter belongs to phy_counters group.
although it has special handling, it doesn't mean it can't be there.

Move it to phy_counters_grp handler.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25 14:11:01 -07:00
Tariq Toukan 3a2f703312 net/mlx5: Use order-0 allocations for all WQ types
Complete the transition of all WQ types to use fragmented
order-0 coherent memory instead of high-order allocations.

CQ-WQ already uses order-0.
Here we do the same for cyclic and linked-list WQs.

This allows the driver to load cleanly on systems with a highly
fragmented coherent memory.

Performance tests:
ConnectX-5 100Gbps, CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Packet rate of 64B packets, single transmit ring, size 8K.

No degradation is sensed.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25 14:11:00 -07:00
Tariq Toukan 549322f2f9 net/mlx5i: Use compilation flag in IPOIB header
If CONFIG_MLX5_CORE_IPOIB is not set, compile-out the
IPOIB related headers.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25 14:11:00 -07:00
Tariq Toukan 043dc78ecf net/mlx5e: TX, Use actual WQE size for SQ edge fill
We fill SQ edge with NOPs to avoid WQEs wrap.
Here, instead of doing that in advance for the maximum possible
WQE size, we do it on-demand using the actual WQE size.
We re-order some parts in mlx5e_sq_xmit to finish the calculation
of WQE size (ds_cnt) before doing any writes to the WQE buffer.

When SQ work queue is fragmented (introduced in an downstream patch),
dealing with WQE wraps becomes more frequent. This change would drastically
reduce the overhead in this case.

Performance tests:
ConnectX-5 100Gbps, CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Packet rate of 64B packets, single transmit ring, size 8K.

Before: 14.9 Mpps
After:  15.8 Mpps

Improvement of 6%.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25 14:11:00 -07:00
Tariq Toukan ddf385e31f net/mlx5e: Use WQ API functions instead of direct fields access
Use the WQ API to get the WQ size, and to map a counter
into a WQ entry index.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25 14:11:00 -07:00
Chris Mi e4ad91f23f net/mlx5e: Split offloaded eswitch TC rules for port mirroring
If a TC rule needs to be split for mirroring, create two HW rules,
in the first level and the second level flow tables accordingly.

In the first level flow table, forward the packet to the mirror
port and forward the packet to the second level flow table for
further processing, eg. encap, vlan push or header re-write.

Currently the matching is repeated in both stages.

While here, simplify the setup of the vhca id valid indicator also
in the existing code.

Signed-off-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25 14:11:00 -07:00
Chris Mi 592d365159 net/mlx5e: Parse mirroring action for offloaded TC eswitch flows
Currently, we only support the mirred redirect TC sub-action. In order
to support flow based vport mirroring, add support to parse the mirred
mirror sub-action.

For mirroring, user-space will typically set the action order such that
the mirror port (mirror VF) sees packets as the original port (VF under
mirroring) sent them or as it will receive them.

In the general case, it means that packets are potentially sent to the
mirror port before or after some actions were applied on them. To
properly do that, we should follow on the exact action order as set for
the flow and make sure this will also be the case when we program the HW
offload.

We introduce a counter for the output ports (attr->out_count), which we
increase when parsing each mirred redirect/mirror sub-action and when
dealing with encap.

We introduce a counter (attr->mirror_count) telling us if split is
needed. If no split is needed and mirroring is just multicasting to
vport, the mirror count is zero, all the actions of the TC flow should
apply on that single HW flow.

If split is needed, the mirror count tells where to do the split, all
non-mirred tc actions should apply only after the split.

The mirror count is set while parsing the following actions encap/decap,
header re-write, vlan push/pop.

Signed-off-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25 14:11:00 -07:00
Chris Mi a842dd04cf net/mlx5: E-switch, Create a second level FDB flow table
If firmware supports the forward action with a destination list
that includes a flow table, create a second level FDB flow table.

This is going to be used for flow based mirroring under the switchdev
offloads mode.

Signed-off-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25 14:11:00 -07:00
Chris Mi 52fff3274b net/mlx5: E-Switch, Reorganize and rename fdb flow tables
We have several fdb flow tables for each of the legacy and switchdev
modes. In the switchdev mode, there are fast path and slow path flow
tables. Towards adding more flow tables in upcoming patches, reorganize
and rename the various existing ones to reflect their functionality.

Signed-off-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25 14:11:00 -07:00
David S. Miller d7c52fc8bc mlx5e-updates-2018-05-19
This series contains updates for mlx5e netdevice driver with one subject,
 DSCP to priority mapping, in the first patch Huy adds the needed API in
 dcbnl, the second patch adds the needed mlx5 core capability bits for the
 feature, and all other patches are mlx5e (netdev) only changes to add
 support for the feature.
 
 From: Huy Nguyen
 
 Dscp to priority mapping for Ethernet packet:
 
 These patches enable differentiated services code point (dscp) to
 priority mapping for Ethernet packet. Once this feature is
 enabled, the packet is routed to the corresponding priority based on its
 dscp. User can combine this feature with priority flow control (pfc)
 feature to have priority flow control based on the dscp.
 
 Firmware interface:
 Mellanox firmware provides two control knobs for this feature:
   QPTS register allow changing the trust state between dscp and
   pcp mode. The default is pcp mode. Once in dscp mode, firmware will
   route the packet based on its dscp value if the dscp field exists.
 
   QPDPM register allow mapping a specific dscp (0 to 63) to a
   specific priority (0 to 7). By default, all the dscps are mapped to
   priority zero.
 
 Software interface:
 This feature is controlled via application priority TLV. IEEE
 specification P802.1Qcd/D2.1 defines priority selector id 5 for
 application priority TLV. This APP TLV selector defines DSCP to priority
 map. This APP TLV can be sent by the switch or can be set locally using
 software such as lldptool. In mlx5 drivers, we add the support for net
 dcb's getapp and setapp call back. Mlx5 driver only handles the selector
 id 5 application entry (dscp application priority application entry).
 If user sends multiple dscp to priority APP TLV entries on the same
 dscp, the last sent one will take effect. All the previous sent will be
 deleted.
 
 This attribute combined with pfc attribute allows advanced user to
 fine tune the qos setting for specific priority queue. For example,
 user can give dedicated buffer for one or more priorities or user
 can give large buffer to certain priorities.
 
 The dcb buffer configuration will be controlled by lldptool.
 >> lldptool -T -i eth2 -V BUFFER prio 0,2,5,7,1,2,3,6
       maps priorities 0,1,2,3,4,5,6,7 to receive buffer 0,2,5,7,1,2,3,6
 >> lldptool -T -i eth2 -V BUFFER size 87296,87296,0,87296,0,0,0,0
       sets receive buffer size for buffer 0,1,2,3,4,5,6,7 respectively
 
 After discussion on mailing list with Jakub, Jiri, Ido and John, we agreed to
 choose dcbnl over devlink interface since this feature is intended to set
 port attributes which are governed by the netdev instance of that port, where
 devlink API is more suitable for global ASIC configurations.
 
 The firmware trust state (in QPTS register) is changed based on the
 number of dscp to priority application entries. When the first dscp to
 priority application entry is added by the user, the trust state is
 changed to dscp. When the last dscp to priority application entry is
 deleted by the user, the trust state is changed to pcp.
 
 When the port is in DSCP trust state, the transmit queue is selected
 based on the dscp of the skb.
 
 When the port is in DSCP trust state and vport inline mode is not NONE,
 firmware requires mlx5 driver to copy the IP header to the
 wqe ethernet segment inline header if the skb has it.
 This is done by changing the transmit queue sq's min inline mode to L3.
 Note that the min inline mode of sqs that belong to other features
 such as xdpsq, icosq are not modified.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbBy2YAAoJEEg/ir3gV/o+RkYIAId0h3GqVof41/qCHPuqx+7j
 J/lTJSU+PHDZfVAuXhoQN5lFGsRku/wkqdSZNXxwvbUxVzIoUpxWvlaEv+481Cmj
 +WhVC1QIExGYUs4CCGjgl853kNtGpDrZmk+olY3QaQISHdpcmgs/W13YlEKRyw5q
 nFSa6GLA/3IQXb7eOwUiSW1gYYjn/eM2XFHntTTOeasa9dwb2QDqDQdoYiKdS+yS
 KxqxrlADU4V+CtNlOmZ4QGxOa7suBBr+a6JvNnvviYOKKpYMNQRbhUbgu0pPdKAt
 zbHHS7Twy1ka3s6CKk/QIT7/YWfCOQgWy6LGPCtUuGUA6dy5xsV7i2BtoDwMC7o=
 =EJd2
 -----END PGP SIGNATURE-----

Merge tag 'mlx5e-updates-2018-05-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-updates-2018-05-19

This series contains updates for mlx5e netdevice driver with one subject,
DSCP to priority mapping, in the first patch Huy adds the needed API in
dcbnl, the second patch adds the needed mlx5 core capability bits for the
feature, and all other patches are mlx5e (netdev) only changes to add
support for the feature.

From: Huy Nguyen

Dscp to priority mapping for Ethernet packet:

These patches enable differentiated services code point (dscp) to
priority mapping for Ethernet packet. Once this feature is
enabled, the packet is routed to the corresponding priority based on its
dscp. User can combine this feature with priority flow control (pfc)
feature to have priority flow control based on the dscp.

Firmware interface:
Mellanox firmware provides two control knobs for this feature:
  QPTS register allow changing the trust state between dscp and
  pcp mode. The default is pcp mode. Once in dscp mode, firmware will
  route the packet based on its dscp value if the dscp field exists.

  QPDPM register allow mapping a specific dscp (0 to 63) to a
  specific priority (0 to 7). By default, all the dscps are mapped to
  priority zero.

Software interface:
This feature is controlled via application priority TLV. IEEE
specification P802.1Qcd/D2.1 defines priority selector id 5 for
application priority TLV. This APP TLV selector defines DSCP to priority
map. This APP TLV can be sent by the switch or can be set locally using
software such as lldptool. In mlx5 drivers, we add the support for net
dcb's getapp and setapp call back. Mlx5 driver only handles the selector
id 5 application entry (dscp application priority application entry).
If user sends multiple dscp to priority APP TLV entries on the same
dscp, the last sent one will take effect. All the previous sent will be
deleted.

This attribute combined with pfc attribute allows advanced user to
fine tune the qos setting for specific priority queue. For example,
user can give dedicated buffer for one or more priorities or user
can give large buffer to certain priorities.

The dcb buffer configuration will be controlled by lldptool.
>> lldptool -T -i eth2 -V BUFFER prio 0,2,5,7,1,2,3,6
      maps priorities 0,1,2,3,4,5,6,7 to receive buffer 0,2,5,7,1,2,3,6
>> lldptool -T -i eth2 -V BUFFER size 87296,87296,0,87296,0,0,0,0
      sets receive buffer size for buffer 0,1,2,3,4,5,6,7 respectively

After discussion on mailing list with Jakub, Jiri, Ido and John, we agreed to
choose dcbnl over devlink interface since this feature is intended to set
port attributes which are governed by the netdev instance of that port, where
devlink API is more suitable for global ASIC configurations.

The firmware trust state (in QPTS register) is changed based on the
number of dscp to priority application entries. When the first dscp to
priority application entry is added by the user, the trust state is
changed to dscp. When the last dscp to priority application entry is
deleted by the user, the trust state is changed to pcp.

When the port is in DSCP trust state, the transmit queue is selected
based on the dscp of the skb.

When the port is in DSCP trust state and vport inline mode is not NONE,
firmware requires mlx5 driver to copy the IP header to the
wqe ethernet segment inline header if the skb has it.
This is done by changing the transmit queue sq's min inline mode to L3.
Note that the min inline mode of sqs that belong to other features
such as xdpsq, icosq are not modified.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25 16:41:23 -04:00
Bo Chen a45675796a 8139too: Remove unnecessary netif_napi_del()
The call to free_netdev() in __rtl8139_cleanup_dev() clears the network device
napi list, and explicit calls to netif_napi_del() are unnecessary.

Signed-off-by: Bo Chen <chenbo@pdx.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25 16:35:45 -04:00
Thomas Falcon eb110410b9 ibmvnic: Fix partial success login retries
In its current state, the driver will handle backing device
login in a loop for a certain number of retries while the
device returns a partial success, indicating that the driver
may need to try again using a smaller number of resources.

The variable it checks to continue retrying may change
over the course of operations, resulting in reallocation
of resources but exits without sending the login attempt.
Guard against this by introducing a boolean variable that
will retain the state indicating that the driver needs to
reattempt login with backing device firmware.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25 16:32:48 -04:00
Manish Chopra 608e00d0a2 qed*: Support drop action classification
With this patch, User can configure for the supported
flows to be dropped. Added a stat "gft_filter_drop"
as well to be populated in ethtool for the dropped flows.

For example -

ethtool -N p5p1 flow-type udp4 dst-port 8000 action -1
ethtool -N p5p1 flow-type tcp4 scr-ip 192.168.8.1 action -1

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25 16:10:42 -04:00
Manish Chopra 39385ab02c qede: Support flow classification to the VFs.
With the supported classification modes [4 tuples based,
udp port based, src-ip based], flows can be classified
to the VFs as well. With this patch, flows can be re-directed
to the requested VF provided in "action" field of command.

Please note that driver doesn't really care about the queue bits
in "action" field for the VFs. Since queue will be still chosen
by FW using RSS hash. [I.e., the classification would be done
according to vport-only]

For examples -

ethtool -N p5p1 flow-type udp4 dst-port 8000 action 0x100000000
ethtool -N p5p1 flow-type tcp4 src-ip 192.16.6.10 action 0x200000000
ethtool -U p5p1 flow-type tcp4 src-ip 192.168.40.100 dst-ip \
	192.168.40.200 src-port 6660 dst-port 5550 \
	action 0x100000000

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25 16:10:42 -04:00
Manish Chopra 3893fc62b1 qed*: Support other classification modes.
Currently, driver supports flow classification to PF
receive queues based on TCP/UDP 4 tuples [src_ip, dst_ip,
src_port, dst_port] only.

This patch enables to configure different flow profiles
[For example - only UDP dest port or src_ip based] on the
adapter so that classification can be done according to
just those fields as well. Although, at a time just one
type of flow configuration is supported due to limited
number of flow profiles available on the device.

For example -

ethtool -N enp7s0f0 flow-type udp4 dst-port 45762 action 2
ethtool -N enp7s0f0 flow-type tcp4 src-ip 192.16.4.10 action 1
ethtool -N enp7s0f0 flow-type udp6 dst-port 45762 action 3

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25 16:10:42 -04:00
Manish Chopra 89ffd14ee9 qede: Validate unsupported configurations
Validate and prevent some of the configurations for
unsupported [by firmware] inputs [for example - mac ext,
vlans, masks/prefix, tos/tclass] via ethtool -N/-U.

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25 16:10:42 -04:00
Manish Chopra 87885310c1 qede: Refactor ethtool rx classification flow.
This patch simplifies the ethtool rx flow configuration
[via ethtool -U/-N] flow code base by dividing it logically
into various APIs based on given protocols. It also separates
various validations and calculations done along the flow
in their own APIs.

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25 16:10:42 -04:00
Arjun Vynipadath e2f4f4e927 cxgb4/cxgb4vf: Notify link changes to OS-dependent code
We have a confusion of two different abstractions in the Common
Code:  Physical Link (Port) and Logical Network Interface (Virtual
Interface), and we haven't been properly managing the state of the
intersection of those two abstractions.
On the one hand we have the Physical state of the Link -- up or down --
and on the other we have the logical state of the VI, enabled or not.
{ethN} refers to both the Physical and Logical State. In this case,
ifconfig only affects/interrogates the Logical State of a VI,
and ethtool only deals with the Physical State. And these are different.

So, just because we disable the VI, we don't really want to change the
Physical Link Up/Down state.  Thus, the previous hack to set
"lc->link_ok = 0" when we disable a VI is completely incorrect.

Where we get into trouble is where the Physical Link State and the
Logical VI State cross swords.  And that happens in
t4_handle_get_port_info() where we need to manage/safe the Physical
Link State, but we also need to know when the Logical VI State has
changed and pass that back up to the OS-dependent Driver routine
t4_os_link_changed() which is concerned about the Logical Interface.

So we enable a VI and that causes Firmware to send us a new Port
Information message, but if none of the Physical Link State
particulars have changed, we don't call t4_os_link_changed().

This fix uses the existing OS Contract APIs for the Common Code to
inform the OS-dependent portion of the Host Driver when the "Link" (really
Logical Network Interface) is "up" or "down". A new API
t4_enable_pi_params() is added which calls t4_enable_vi_params() and,
if that is successful, then calls back to the OS Contract API
t4_os_link_changed() notifying the OS-dependent layer of the
potential Link State change.

Original Work by : Casey Leedom <leedom@chelsio.com>

Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25 15:10:07 -04:00
Ganesh Goudar e8d452923a cxgb4: clean up init_one
clean up init_one and use chip_ver consistently throughout
init_one() for chip version.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25 14:59:38 -04:00
Ganesh Goudar 57ccaedb74 cxgb4/cxgb4vf: link management changes for new SFP
newer SFPs like SFP28 and QSFP28 Transceiver Modules present
several new possibilities which we haven't faced before. Fix the
assumptions in the code reflecting the more limited capabilities
of previous Transceiver Module systems

Original work by Casey Leedom <leedom@chelsio.com>

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25 14:56:10 -04:00
YueHaibing b526e56b31 net: fec: remove stale comment
This comment is outdated as fec_ptp_ioctl has been replaced by fec_ptp_set/fec_ptp_get
since commit 1d5244d0e4 ("fec: Implement the SIOCGHWTSTAMP ioctl")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25 14:53:02 -04:00
Martin Habets 0c235113b3 sfc: stop the TX queue before pushing new buffers
efx_enqueue_skb() can push new buffers for the xmit_more functionality.
We must stops the TX queue before this or else the TX queue does not get
restarted and we get a netdev watchdog.

In the error handling we may now need to unwind more than 1 packet, and
we may need to push the new buffers onto the partner queue.

v2: In the error leg also push this queue if xmit_more is set

Fixes: e9117e5099 ("sfc: Firmware-Assisted TSO version 2")
Reported-by: Jarod Wilson <jarod@redhat.com>
Tested-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25 14:49:37 -04:00
Qing Huang 1383cb8103 mlx4_core: allocate ICM memory in page size chunks
When a system is under memory presure (high usage with fragments),
the original 256KB ICM chunk allocations will likely trigger kernel
memory management to enter slow path doing memory compact/migration
ops in order to complete high order memory allocations.

When that happens, user processes calling uverb APIs may get stuck
for more than 120s easily even though there are a lot of free pages
in smaller chunks available in the system.

Syslog:
...
Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task
oracle_205573_e:205573 blocked for more than 120 seconds.
...

With 4KB ICM chunk size on x86_64 arch, the above issue is fixed.

However in order to support smaller ICM chunk size, we need to fix
another issue in large size kcalloc allocations.

E.g.
Setting log_num_mtt=30 requires 1G mtt entries. With the 4KB ICM chunk
size, each ICM chunk can only hold 512 mtt entries (8 bytes for each mtt
entry). So we need a 16MB allocation for a table->icm pointer array to
hold 2M pointers which can easily cause kcalloc to fail.

The solution is to use kvzalloc to replace kcalloc which will fall back
to vmalloc automatically if kmalloc fails.

Signed-off-by: Qing Huang <qing.huang@oracle.com>
Acked-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25 10:22:53 -04:00
John Hurley 7e24a59311 nfp: flower: compute link aggregation action
If the egress device of an offloaded rule is a LAG port, then encode the
output port to the NFP with a LAG identifier and the offloaded group ID.

A prelag action is also offloaded which must be the first action of the
series (although may appear after other pre-actions - e.g. tunnels). This
causes the FW to check that it has the necessary information to output to
the requested LAG port. If it does not, the packet is sent to the kernel
before any other actions are applied to it.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 23:10:57 -04:00
John Hurley 2e1cc5226b nfp: flower: implement host cmsg handler for LAG
Adds the control message handler to synchronize offloaded group config
with that of the kernel. Such messages are sent from fw to driver and
feature the following 3 flags:

- Data: an attached cmsg could not be processed - store for retransmission
- Xon: FW can accept new messages - retransmit any stored cmsgs
- Sync: full sync requested so retransmit all kernel LAG group info

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 23:10:57 -04:00
John Hurley bb9a8d0311 nfp: flower: monitor and offload LAG groups
Monitor LAG events via the NETDEV_CHANGEUPPER/NETDEV_CHANGELOWERSTATE
notifiers to maintain a list of offloadable groups. Sync these groups with
HW via a delayed workqueue to prevent excessive re-configuration. When the
workqueue is triggered it may generate multiple control messages for
different groups. These messages are linked via a batch ID and flags to
indicate a new batch and the end of a batch.

Update private data in each repr to track their LAG lower state flags. The
state of a repr is used to determine the active netdevs that can be
offloaded. For example, in active-backup mode, we only offload the netdev
currently active.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 23:10:57 -04:00
John Hurley b945245297 nfp: flower: add per repr private data for LAG offload
Add a bitmap to each flower repr to track its state if it is enslaved by a
bond. This LAG state may be different to the port state - for example, the
port may be up but LAG state may be down due to the selection in an
active/backup bond.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 23:10:56 -04:00
John Hurley 898bc7d634 nfp: flower: check for/turn on LAG support in firmware
Check if the fw contains the _abi_flower_balance_sync_enable symbol. If it
does then write a 1 to this indicating that the driver is willing to
receive NIC to kernel LAG related control messages.

If the write is successful, update the list of extra features supported by
the fw and add a stub to accept LAG cmsgs.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 23:10:56 -04:00
John Hurley 1945ca7a81 nfp: nfpcore: add rtsym writing function
Add an rtsym API function that combines the lookup of a symbol and the
writing of a value to it. Values can be written as unsigned 32 or 64 bits.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 23:10:56 -04:00
John Hurley 24f132e29c nfp: add ndo_set_mac_address for representors
Adding a netdev to a bond requires that its mac address can be modified.
The default eth_mac_addr is sufficient to satisfy this requirement.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 23:10:56 -04:00
Govindarajulu Varadarajan 322eaa06d5 enic: set DMA mask to 47 bit
In commit 624dbf55a3 ("driver/net: enic: Try DMA 64 first, then
failover to DMA") DMA mask was changed from 40 bits to 64 bits.
Hardware actually supports only 47 bits.

Fixes: 624dbf55a3 ("driver/net: enic: Try DMA 64 first, then failover to DMA")
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 23:05:30 -04:00
David S. Miller 90fed9c946 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2018-05-24

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Björn Töpel cleans up AF_XDP (removes rebind, explicit cache alignment from uapi, etc).

2) David Ahern adds mtu checks to bpf_ipv{4,6}_fib_lookup() helpers.

3) Jesper Dangaard Brouer adds bulking support to ndo_xdp_xmit.

4) Jiong Wang adds support for indirect and arithmetic shifts to NFP

5) Martin KaFai Lau cleans up BTF uapi and makes the btf_header extensible.

6) Mathieu Xhonneux adds an End.BPF action to seg6local with BPF helpers allowing
   to edit/grow/shrink a SRH and apply on a packet generic SRv6 actions.

7) Sandipan Das adds support for bpf2bpf function calls in ppc64 JIT.

8) Yonghong Song adds BPF_TASK_FD_QUERY command for introspection of tracing events.

9) other misc fixes from Gustavo A. R. Silva, Sirio Balmelli, John Fastabend, and Magnus Karlsson
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 22:20:51 -04:00
Thomas Falcon 2770a7984d ibmvnic: Introduce hard reset recovery
Introduce a recovery hard reset to handle reset failure as a result of
change of device context following a transport event, such as a
backing device failover or partition migration. These operations reset
the device context to its initial state. If this occurs during a reset,
any initialization commands are likely to fail with an invalid state
error as backing device firmware requests reinitialization.

When this happens, make one more attempt by performing a hard reset,
which frees any resources currently allocated and performs device
initialization. If a transport event occurs during a device reset, a
flag is set which will trigger a new hard reset following the
completionof the current reset event.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 22:19:26 -04:00
Thomas Falcon 06e43d7f9f ibmvnic: Set resetting state at earliest possible point
Set device resetting state at the earliest possible point: as soon as a
reset is successfully scheduled. The reset state is toggled off when
all resets have been processed to completion.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 22:19:26 -04:00
Thomas Falcon 8a348450a0 ibmvnic: Create separate initialization routine for resets
Instead of having one initialization routine for all cases, create
a separate, simpler function for standard initialization, such as during
device probe. Use the original initialization function to handle
device reset scenarios. The goal of this patch is to avoid having
a single, cluttered init function to handle all possible
scenarios.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 22:19:26 -04:00
Thomas Falcon ab5ec33b9a ibmvnic: Handle error case when setting link state
If setting the link state is not successful, print a warning
with the resulting return code and return it to be handled
by the caller.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 22:19:26 -04:00
Thomas Falcon 17c8705838 ibmvnic: Return error code if init interrupted by transport event
If device init is interrupted by a failover, set the init return
code so that it can be checked and handled appropriately by the
init routine.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 22:19:26 -04:00
Thomas Falcon 9c4eaabd1b ibmvnic: Check CRQ command return codes
Check whether CRQ command is successful before awaiting a response
from the management partition. If the command was not successful, the
driver may hang waiting for a response that will never come.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 22:19:26 -04:00
Thomas Falcon 5153698e55 ibmvnic: Introduce active CRQ state
Introduce an "active" state for a IBM vNIC Command-Response Queue. A CRQ
is considered active once it has initialized or linked with its partner by
sending an initialization request and getting a successful response back
from the management partition.  Until this has happened, do not allow CRQ
commands to be sent other than the initialization request.

This change will avoid a protocol error in case of a device transport
event occurring during a initialization. When the driver receives a
transport event notification indicating that the backing hardware
has changed and needs reinitialization, any further commands other
than the initialization handshake with the VIOS management partition
will result in an invalid state error. Instead of sending a command
that will be returned with an error, print a warning and return an
error that will be handled by the caller.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 22:19:25 -04:00
Thomas Falcon c3f2241547 ibmvnic: Mark NAPI flag as disabled when released
Set adapter NAPI state as disabled if they are removed. This will allow
them to be enabled again if reallocated in case of a hard reset.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 22:19:25 -04:00
YueHaibing d624613e42 cxgb4: Check for kvzalloc allocation failure
t4_prep_fw doesn't check for card_fw pointer before store the read data,
which could lead to a NULL pointer dereference if kvzalloc failed.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 21:52:44 -04:00
Jesper Dangaard Brouer 735fc4054b xdp: change ndo_xdp_xmit API to support bulking
This patch change the API for ndo_xdp_xmit to support bulking
xdp_frames.

When kernel is compiled with CONFIG_RETPOLINE, XDP sees a huge slowdown.
Most of the slowdown is caused by DMA API indirect function calls, but
also the net_device->ndo_xdp_xmit() call.

Benchmarked patch with CONFIG_RETPOLINE, using xdp_redirect_map with
single flow/core test (CPU E5-1650 v4 @ 3.60GHz), showed
performance improved:
 for driver ixgbe: 6,042,682 pps -> 6,853,768 pps = +811,086 pps
 for driver i40e : 6,187,169 pps -> 6,724,519 pps = +537,350 pps

With frames avail as a bulk inside the driver ndo_xdp_xmit call,
further optimizations are possible, like bulk DMA-mapping for TX.

Testing without CONFIG_RETPOLINE show the same performance for
physical NIC drivers.

The virtual NIC driver tun sees a huge performance boost, as it can
avoid doing per frame producer locking, but instead amortize the
locking cost over the bulk.

V2: Fix compile errors reported by kbuild test robot <lkp@intel.com>
V4: Isolated ndo, driver changes and callers.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-24 18:36:15 -07:00
Yossi Kuperman 1dcbc01f73 net/mlx5: IPSec, Fix a race between concurrent sandbox QP commands
Sandbox QP Commands are retired in the order they are sent. Outstanding
commands are stored in a linked-list in the order they appear. Once a
response is received and the callback gets called, we pull the first
element off the pending list, assuming they correspond.

Sending a message and adding it to the pending list is not done atomically,
hence there is an opportunity for a race between concurrent requests.

Bind both send and add under a critical section.

Fixes: bebb23e6cb ("net/mlx5: Accel, Add IPSec acceleration interface")
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Adi Nissim <adin@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-24 14:40:40 -07:00
Eran Ben Elisha 902a545904 net/mlx5e: When RXFCS is set, add FCS data into checksum calculation
When RXFCS feature is enabled, the HW do not strip the FCS data,
however it is not present in the checksum calculated by the HW.

Fix that by manually calculating the FCS checksum and adding it to the SKB
checksum field.

Add helper function to find the FCS data for all SKB forms (linear,
one fragment or more).

Fixes: 102722fc68 ("net/mlx5e: Add support for RXFCS feature flag")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-24 14:40:39 -07:00
Huy Nguyen ecdf2dadee net/mlx5e: Receive buffer support for DCBX
Add dcbnl's set/get buffer configuration callback that allows user to
set/get buffer size configuration and priority to buffer mapping.

By default, firmware controls receive buffer configuration and priority
of buffer mapping based on the changes in pfc settings. When set buffer
call back is triggered, the buffer configuration changes to manual mode.

The manual mode means mlx5 driver will adjust the buffer configuration
accordingly based on the changes in pfc settings.

ConnectX buffer stride is 128 Bytes. If the buffer size is not multiple
of 128, the buffer size will be rounded down to the nearest multiple of
128.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-24 14:23:33 -07:00
Huy Nguyen 0696d60853 net/mlx5e: Receive buffer configuration
Add APIs for buffer configuration based on the changes in
pfc configuration, cable len, buffer size configuration,
and priority to buffer mapping.

Note that the xoff fomula is as below
  xoff = ((301+2.16 * len [m]) * speed [Gbps] + 2.72 MTU [B]
  xoff_threshold = buffer_size - xoff
  xon_threshold = xoff_threshold - MTU

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-24 14:23:33 -07:00
Huy Nguyen 50b4a3c236 net/mlx5: PPTB and PBMC register firmware command support
Add firmware command interface to read and write PPTB and PBMC
registers.

PPTB register enables mappings priority to a specific receive buffer.

PBMC registers enables changing the receive buffer's configuration such
as buffer size, xon/xoff thresholds, buffer's lossy property and
buffer's shared property.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-24 14:23:33 -07:00
Huy Nguyen 2c81bfd5ae net/mlx5e: Move port speed code from en_ethtool.c to en/port.c
Move four below functions from en_ethtool.c to en/port.c. These
functions are used by both en_ethtool.c and en_main.c. Future code
can use these functions without ethtool link mode dependency.
  u32 mlx5e_port_ptys2speed(u32 eth_proto_oper);
  int mlx5e_port_linkspeed(struct mlx5_core_dev *mdev, u32 *speed);
  int mlx5e_port_max_linkspeed(struct mlx5_core_dev *mdev, u32 *speed);
  u32 mlx5e_port_speed2linkmodes(u32 speed);

Delete the speed field from table mlx5e_build_ptys2ethtool_map. This
table only keeps the mapping between the mlx5e link mode and
ethtool link mode. Add new table mlx5e_link_speed for translation
from mlx5e link mode to actual speed.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-24 14:23:33 -07:00
Jason Gunthorpe c62091bcd9 mlx5-updates-2018-05-17
mlx5 core dirver updates for both net-next and rdma-next branches.
 
 From Christophe JAILLET, first three patche to use kvfree where needed.
 
 From: Or Gerlitz <ogerlitz@mellanox.com>
 
 Next six patches from Roi and Co adds support for merged
 sriov e-switch which comes to serve cases where both PFs, VFs set
 on them and both uplinks are to be used in single v-switch SW model.
 When merged e-switch is supported, the per-port e-switch is logically
 merged into one e-switch that spans both physical ports and all the VFs.
 
 This model allows to offload TC eswitch rules between VFs belonging
 to different PFs (and hence have different eswitch affinity), it also
 sets the some of the foundations needed for uplink LAG support.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJa/fLEAAoJEEg/ir3gV/o+7jUH/3n5/Uw1LLt3TfeKArx6i0F1
 3G4U5B0ha03qiDqXprwhyQ3I6lgYmRBmjcxnqmvcqOAqO4/hSsjtTR+A/mgbEDhJ
 YtdekFNEX+72h/N2GIpZwChIWSE3EcMPaLYnV8TwLUgh9YSust2sCLSBbJCjxOKc
 j78M8ept/bXZwTm/iJhEjtmqw0xl91rl011chCAua0iEpH3wxteDARmKABFHMQxl
 I3N/x/e/astgcSCNgpO4uDf9zEIRkNdzcHPzSMJ6C2Oo5W9XiZEekfw7WKj9nXfa
 G+eGckkAyCOQ/r2lZ9nA0ZUvQ2X6JISvxgohuaCNwTgsz3acTxbLnQK4YWHzQCQ=
 =iHi6
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2018-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into for-next

mlx5-updates-2018-05-17

mlx5 core dirver updates for both net-next and rdma-next branches.

From Christophe JAILLET, first three patche to use kvfree where needed.

From: Or Gerlitz <ogerlitz@mellanox.com>

Next six patches from Roi and Co adds support for merged
sriov e-switch which comes to serve cases where both PFs, VFs set
on them and both uplinks are to be used in single v-switch SW model.
When merged e-switch is supported, the per-port e-switch is logically
merged into one e-switch that spans both physical ports and all the VFs.

This model allows to offload TC eswitch rules between VFs belonging
to different PFs (and hence have different eswitch affinity), it also
sets the some of the foundations needed for uplink LAG support.

* tag 'mlx5-updates-2018-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5e: Explicitly set source e-switch in offloaded TC rules
  net/mlx5: Add source e-switch owner
  net/mlx5e: Explicitly set destination e-switch in FDB rules
  net/mlx5: Add destination e-switch owner
  net/mlx5: Properly handle a vport destination when setting FTE
  net/mlx5: Add merged e-switch cap
  IB/mlx5: Use 'kvfree()' for memory allocated by 'kvzalloc()'
  net/mlx5: Eswitch, Use 'kvfree()' for memory allocated by 'kvzalloc()'
  net/mlx5: Vport, Use 'kvfree()' for memory allocated by 'kvzalloc()'
2018-05-24 09:40:43 -06:00
Tom Lendacky 76cce0af85 amd-xgbe: Improve SFP 100Mbps auto-negotiation
After changing speed to 100Mbps as a result of auto-negotiation (AN),
some 10/100/1000Mbps SFPs indicate a successful link (no faults or loss
of signal), but cannot successfully transmit or receive data.  These
SFPs required an extra auto-negotiation (AN) after the speed change in
order to operate properly.  Add a quirk for these SFPs so that if the
outcome of the AN actually results in changing to a new speed, re-initiate
AN at that new speed.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:33:01 -04:00
Tom Lendacky e722ec8237 amd-xgbe: Update the BelFuse quirk to support SGMII
Instead of using a quirk to make the BelFuse 1GBT-SFP06 part look like
a 1000baseX part, program the SFP PHY to support SGMII and 10/100/1000
baseT.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:33:01 -04:00
Tom Lendacky 418746298e amd-xgbe: Advertise FEC support with the KR re-driver
When a KR re-driver is present, indicate the FEC support is available
during auto-negotiation.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:33:01 -04:00
Tom Lendacky eca282b841 amd-xgbe: Always attempt link training in KR mode
Link training is always attempted when in KR mode, but the code is
structured to check if link training has been enabled before attempting
to perform it.  Since that check will always be true, simplify the code
to always enable and start link training during KR auto-negotiation.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:33:00 -04:00
Tom Lendacky 01b5277fc9 amd-xgbe: Add ethtool show/set channels support
Add ethtool support to show and set the device channel configuration.
Changing the channel configuration will result in a device restart.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:33:00 -04:00
Tom Lendacky 2244753409 amd-xgbe: Prepare for ethtool set-channel support
In order to support being able to dynamically set/change the number of
Rx and Tx channels, update the code to:
 - Move alloc and free of device memory into callable functions
 - Move setting of the real number of Rx and Tx channels to device startup
 - Move mapping of the RSS channels to device startup

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:33:00 -04:00
Tom Lendacky bab748de98 amd-xgbe: Add ethtool show/set ring parameter support
Add ethtool support to show and set the number of the Rx and Tx ring
descriptors.  Changing the ring configuration will result in a device
restart.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:33:00 -04:00
Tom Lendacky 53a1024abf amd-xgbe: Add ethtool support to retrieve SFP module info
Add support to get SFP module information using ethtool.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:33:00 -04:00
Tom Lendacky 67cea0c922 amd-xgbe: Remove field that indicates SFP diagnostic support
The driver currently sets an indication of whether the SFP supports, and
that the driver can obtain, diagnostics data.  This isn't currently used
by the driver and the logic to set this indicator is flawed because the
field is cleared each time the SFP is checked and only set when a new SFP
is detected.  Remove this field and the logic supporting it.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:33:00 -04:00
Tom Lendacky 0d2b5255ea amd-xgbe: Remove use of comm_owned field
The comm_owned field can hide logic where double locking is attempted
and prevent multiple threads for the same device from accessing the
mutex properly.  Remove the comm_owned field and use the mutex API
exclusively for gaining ownership.  The current driver has been audited
and is obtaining communications ownership properly.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:33:00 -04:00
Tom Lendacky b93c3ab600 amd-xgbe: Read and save the port property registers during probe
Read and save the port property registers once during the device probe
and then use the saved values as they are needed.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:33:00 -04:00
Tom Lendacky 6c2799c11e amd-xgbe: Fix debug output of max channel counts
A debug output print statement uses the wrong variable to output the
maximum Rx channel count (cut and paste error, basically).  Fix the
statement to use the proper variable.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:33:00 -04:00
Ganesh Goudar 8156b0ba74 cxgb4: do L1 config when module is inserted
trigger an L1 configure operation when a transceiver module
is inserted in order to cause current "sticky" options like
Requested Forward Error Correction to be reapplied.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:00:56 -04:00
Ganesh Goudar 1d19023fa6 cxgb4: change the port capability bits definition
MDI Port Capabilities bit definitions were inconsistent with
regard to the MDI enum values. 2 bits used to define MDI in
the port capabilities are not really separable, it's a 2-bit
field with 4 different values. Change the port capability bit
definitions to be "AUTO" and "STRAIGHT" in order to get them
to line up with the enum's.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:00:56 -04:00
Jack Morgenstein d546b67cda net/mlx4: Fix irq-unsafe spinlock usage
spin_lock/unlock was used instead of spin_un/lock_irq
in a procedure used in process space, on a spinlock
which can be grabbed in an interrupt.

This caused the stack trace below to be displayed (on kernel
4.17.0-rc1 compiled with Lock Debugging enabled):

[  154.661474] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
[  154.668909] 4.17.0-rc1-rdma_rc_mlx+ #3 Tainted: G          I
[  154.675856] -----------------------------------------------------
[  154.682706] modprobe/10159 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[  154.690254] 00000000f3b0e495 (&(&qp_table->lock)->rlock){+.+.}, at: mlx4_qp_remove+0x20/0x50 [mlx4_core]
[  154.700927]
and this task is already holding:
[  154.707461] 0000000094373b5d (&(&cq->lock)->rlock/1){....}, at: destroy_qp_common+0x111/0x560 [mlx4_ib]
[  154.718028] which would create a new lock dependency:
[  154.723705]  (&(&cq->lock)->rlock/1){....} -> (&(&qp_table->lock)->rlock){+.+.}
[  154.731922]
but this new dependency connects a SOFTIRQ-irq-safe lock:
[  154.740798]  (&(&cq->lock)->rlock){..-.}
[  154.740800]
... which became SOFTIRQ-irq-safe at:
[  154.752163]   _raw_spin_lock_irqsave+0x3e/0x50
[  154.757163]   mlx4_ib_poll_cq+0x36/0x900 [mlx4_ib]
[  154.762554]   ipoib_tx_poll+0x4a/0xf0 [ib_ipoib]
...
to a SOFTIRQ-irq-unsafe lock:
[  154.815603]  (&(&qp_table->lock)->rlock){+.+.}
[  154.815604]
... which became SOFTIRQ-irq-unsafe at:
[  154.827718] ...
[  154.827720]   _raw_spin_lock+0x35/0x50
[  154.833912]   mlx4_qp_lookup+0x1e/0x50 [mlx4_core]
[  154.839302]   mlx4_flow_attach+0x3f/0x3d0 [mlx4_core]

Since mlx4_qp_lookup() is called only in process space, we can
simply replace the spin_un/lock calls with spin_un/lock_irq calls.

Fixes: 6dc06c08be ("net/mlx4: Fix the check in attaching steering rules")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 15:48:58 -04:00
Ganesh Goudar 038ed45bce cxgb4: Add new T6 device ids
Add 0x6088 and 0x6089 device ids for new T6 cards.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 15:28:22 -04:00
Colin Ian King 4f7f56b6b1 net/mlx4: fix spelling mistake: "Inrerface" -> "Interface" and rephrase message
Trivial fix to spelling mistake in mlx4_dbg debug message and also
change the phrasing of the message so that is is more readable

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 14:58:10 -04:00
Nathan Fontenot 73f9d36440 ibmvnic: Only do H_EOI for mobility events
When enabling the sub-CRQ IRQ a previous update sent a H_EOI prior
to the enablement to clear any pending interrupts that may be present
across a partition migration. This fixed a firmware bug where a
migration could erroneously indicate that a H_EOI was pending.

The H_EOI should only be sent when enabling during a mobility
event though. Doing so at other time could wrong and can produce
extra driver output when IRQs are enabled when doing TX completion.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 14:54:11 -04:00
Colin Ian King 7c6f97475d net: vxge: fix spelling mistake in macro VXGE_HW_ERR_PRIVILAGED_OPEARATION
Rename VXGE_HW_ERR_PRIVILAGED_OPEARATION to VXGE_HW_ERR_PRIVILEGED_OPERATION
to fix spelling mistake.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 14:50:02 -04:00
Jakub Kicinski 51c1df83e3 nfp: assign vNIC id as phys_port_name of vNICs which are not ports
When NFP is modelled as a switch we assign phys_port_name to respective
port(representor )s:

 vNIC0 - | - PF port (pf%d)     MAC/PHY (p%d[s%d]) - |E==

In most cases there is only one vNIC for communication with the switch.
If there is more than one we need to be able to identify them.  Use %d
as phys_port_name of the vNICs.

We don't have to pass ID to nfp_net_debugfs_vnic_add() separately any
more.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 14:26:19 -04:00
Jakub Kicinski 290f54db31 nfp: use split in naming of PCIe PF ports
PCI PFs can host more than one logical endpoint.  In NFP terms
this means having more than one vNIC for PCIe PF.  The vNICs
are usually corresponding 1:1 to Ethernet ports.  In core NIC
we use the legacy idea of vNIC *being* the Ethernet port,
hence netdevs put pX(sY) in their phys_port_name, like Ethernet
ports would.  When ASIC ports are fully represented we need to
be able to name different PCIe PF ports, too.  Use a scheme
similar to Ethernet ports - pfXsY, for PCIe PF number X,
sub-port Y.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 14:26:19 -04:00
Jakub Kicinski 1f70036723 nfp: abm: force Ethternet port up
Current control firmware does not cater too well to multi-host
applications.  There is no way to check which hosts are up or
otherwise negotiate what the state of the external port (the
Ethernet port) should be.  Make sure the link is up when driver
loads, and don't take it down when Ethernet port netdev is
closed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 14:26:19 -04:00
Jakub Kicinski d05d902eda nfp: abm: spawn port netdevs
To configure buffering points we need full set of netdevs:

                              ASIC

 user netdev  -- | -- PCIe port   MAC port -- | --

Configuring egrees qdiscs on user netdev configures standard
Linux TC software qdiscs, configuring PCIe port qdiscs will
provide a way of setting ASIC queuing parameters for PCIe block.
MAC port netdev egress qdiscs correspond to ASIC MAC Traffic
Manager block.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 14:26:19 -04:00
Jakub Kicinski 4afa3af418 nfp: add devlink_eswitch_mode_set callback
Our previous apps all assumed to use only one eswitch mode (legacy
or switchdev) without the ability to change it.  ABM NIC will
want to support the switch so plumb devlink_eswitch_mode_set through.
The devlink_eswitch_mode_set is expected to spawn representors and
potentially devlink ports so it's called under big devlink lock and
pf->lock.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 14:26:19 -04:00
Jakub Kicinski 634c6b7a85 nfp: add app pointer to port representors
nfp_apps can currently associate their structures with vNICs but
not representors.  Add app priv pointer to representors as well.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 14:26:18 -04:00
Jakub Kicinski cc54dc2804 nfp: abm: create project-specific vNIC structure
ABM NIC requires more complex vNIC handling, allocate
per-vNIC structure.  Find out RX queue base and PCI PF id.
There will be multiple PFs sharing the same MAC port, therefore
the MAC address assigned to the vNIC must be looked up in the
HWInfo database.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 14:26:18 -04:00
Jakub Kicinski c4c8f39a57 nfp: abm: add initial active buffer management NIC skeleton
Add a very rudimentary active buffer management NIC support.
For now it's like a core NIC without SR-IOV support.  Next
commits will extend its functionality.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 14:26:18 -04:00
Jakub Kicinski b586c77b3c nfp: core: allow 4-byte aligned accesses to Memory Units
Current code doesn't enforce length requirements on 32bit accesses
with action NFP_CPP_ACTION_RW to memory units, but if the access
is only aligned to 4 bytes as well we will fall into the explicit
access case and error out.  Such accesses are correct, allow them
by lowering the width earlier.

While at it use a switch statement to improve readability.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 14:26:18 -04:00
Jakub Kicinski a0d163f432 nfp: add shared buffer configuration
Allow app FW to advertise its shared buffer pool information.
Use the per-PF mailbox to configure them from devlink.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 14:26:18 -04:00
Jakub Kicinski 0c693323a1 nfp: add support for per-PCI PF mailbox
When working with devlink-related functionality for locking reasons
it's easier to create a new mailbox per-PCI PF device than try to
use one of the netdev/vNIC mailboxes.

Define new mailbox structure and resolve its symbol during probe.
For forward compatibility allow silent truncation of mailbox command
data.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 14:26:18 -04:00
Jakub Kicinski 8f6196f63c nfp: move rtsym helpers to pf code
nfp_net_pf_rtsym_read_optional() and nfp_net_pf_map_rtsym() are not
really related to networking code.  Move them to the PF code and
remove the net from their names.  They will soon be needed by code
outside of nfp_net_main.c anyway.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 14:26:18 -04:00
Sudarsana Reddy Kalluru d25b859ccd qede: Add support for populating ethernet TLVs.
This patch adds callbacks for providing the ethernet protocol driver TLVs.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 23:29:54 -04:00
Sudarsana Reddy Kalluru 59ccf86fe6 qed: Add driver infrastucture for handling mfw requests.
MFW requests the TLVs in interrupt context. Extracting of the required
data from upper layers and populating of the TLVs require process context.
The patch adds work-queues for processing the tlv requests. It also adds
the implementation for requesting the tlv values from appropriate protocol
driver.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 23:29:54 -04:00
Sudarsana Reddy Kalluru 77a509e4f6 qed: Add support for processing iscsi tlv request.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 23:29:54 -04:00
Sudarsana Reddy Kalluru f240b68822 qed: Add support for processing fcoe tlv request.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 23:29:53 -04:00
Sudarsana Reddy Kalluru 2528c38993 qed: Add support for tlv request processing.
The patch adds driver support for processing TLV requests/repsonses
from the mfw and upper driver layers respectively. The implementation
reads the requested TLVs from the shared memory, requests the values
from upper layer drivers, populates this info (TLVs) shared memory and
notifies MFW about the TLV values.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 23:29:53 -04:00
Sudarsana Reddy Kalluru dd006921d6 qed: Add MFW interfaces for TLV request support.
The patch adds required management firmware (MFW) interfaces such as
mailbox commands, TLV types etc.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 23:29:53 -04:00
David S. Miller 9c803cfd5f Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2018-05-22

This series contains updates to i40e only.

Jake provides all the changes in this series starting with making it
consistent in how we approach the bit lock.  Fixed the reporting of the
VEB statistics and the queue statistics to always return every queue
even if it is not currently in use.  Use WARN_ONCE() so that the first
time we end up with an incorrect size we will dump a stack trace and a
message to help highlight the issue early in testing.  Folded the fixed
string prefix into the stat string definition.  Instead of using a
separate char *p pointer when copying strings, use the data pointer
directly.  Added code comments for several of the statistic functions to
better explain the number and ordering of statistics.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 15:45:06 -04:00
Bo Chen d7db318651 pcnet32: add an error handling path in pcnet32_probe_pci()
Make sure to invoke pci_disable_device() when errors occur in
pcnet32_probe_pci().

Signed-off-by: Bo Chen <chenbo@pdx.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 15:40:15 -04:00
Shahed Shaikh fdd13dd350 qed: Fix mask for physical address in ILT entry
ILT entry requires 12 bit right shifted physical address.
Existing mask for ILT entry of physical address i.e.
ILT_ENTRY_PHY_ADDR_MASK is not sufficient to handle 64bit
address because upper 8 bits of 64 bit address were getting
masked which resulted in completer abort error on
PCIe bus due to invalid address.

Fix that mask to handle 64bit physical address.

Fixes: fe56b9e6a8 ("qed: Add module with basic common support")
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 15:32:55 -04:00
David Ahern 5a15a1b07c mlxsw: spectrum_router: Add support for route append
Handle append for gateway based routes. Dev-only multipath routes will
be handled by a follow on patch.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 14:44:18 -04:00
Fabio Estevam 1f508124e9 net: fec: Add a SPDX identifier
Currently there is no license information in the header of
this file.

The MODULE_LICENSE field contains ("GPL"), which means
GNU Public License v2 or later, so add a corresponding
SPDX license identifier.

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 13:42:05 -04:00
Fabio Estevam 9fcca5effc net: fec: ptp: Switch to SPDX identifier
Adopt the SPDX license identifier headers to ease license compliance
management.

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 13:42:04 -04:00
Jacob Keller 3f76bdb437 i40e: use the more traditional 'i' loop variable
Since we no longer use i as an array index for the data variable,
replace the use of 'j' with 'i' so that we match the general loop
variable name.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22 08:37:06 -07:00
Jacob Keller ec29bbf8ab i40e: add function doc headers for ethtool stats functions
Add documentation for the i40e_get_stats_count, i40e_get_stat_strings
and i40e_get_ethtool_stats explaining that the number and ordering of
statistics must remain constant for a given netdevice.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22 08:37:06 -07:00
Jacob Keller e08696dcd9 i40e: update data pointer directly when copying to the buffer
A future patch is going to add a helper function i40e_add_ethtool_stats
that will help lower the amount of boiler plate code in the
i40e_get_ethtool_stats function.

This conversion will take place over many patches, and the helper
function will work by directly updating a reference to the data pointer.

Since this would not work combined with the current method of accessing
data like an array, update all the code that copies stats into the data
buffer to use direct updates to the pointer instead of array accesses.

This will prevent incorrect stat updates for patches in between the
conversion.

Similarly, when copying strings, we used a separate char *p pointer.
Instead, use the data pointer directly as it's already a (u8 *) type
which is the same size.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22 08:37:06 -07:00
Jacob Keller bf1c39e640 i40e: fold prefix strings directly into stat names
We always prefix these stats with a fixed string, so just fold this
prefix into the stat string definition. This preparatory work will make
it easier to implement a helper function to copy stats and strings into
the supplied buffers in a future patch.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22 08:37:06 -07:00
Jacob Keller 9b10df596b i40e: use WARN_ONCE to replace the commented BUG_ON size check
We don't really want to use BUG_ON here since that would completely
crash the kernel, thus the reason we commented it out. We *can't* use
BUILD_BUG_ON because at least now (a) the sizes aren't constant (we are
fixing this) and (b) not all compilers are smart enough to understand
that "p - data" is a constant.

Instead, just use a WARN_ONCE so that the first time we end up with an
incorrect size we will dump a stack trace and a message, hopefully
highlighting the issues early in testing.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22 08:37:06 -07:00
Jacob Keller 019b9cd44d i40e: split i40e_get_strings() into smaller functions
Split the statistic strings and private flags strings into their own
separate functions to aid code readability.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22 08:37:06 -07:00
Jacob Keller b831236509 i40e: always return all queue stat strings
The ethtool API for obtaining device statistics is not intended to allow
runtime changes in the number of statistics reported. It may *appear*
this way, as there is an ability to request the number of stats using
ethtool_get_set_count(). However, it is expected that this must always
return the same value for invocations of the same device.

If we don't satisfy this contract, and allow the number of stats to
change during run time, we could cause invalid memory accesses or report
the stat strings incorrectly. This is because the API for obtaining
stats is to (1) get the size, (2) get the strings and finally (3) get
the stats. Since these are each separate ethtool op commands, it is not
possible to maintain consistency by holding the RTNL lock over the whole
operation. This results in the potential for a race condition to occur
where the size changed between any of the 3 calls.

Avoid this issue by requiring that we always return the same value for
a given device. We can check any values which remain constant for the
life of the device, but must not report different sizes depending on
runtime attributes.

This patch specifically fixes the queue statistics to always return
every queue even if it's not currently in use.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22 08:37:06 -07:00
Jacob Keller 9955d4940e i40e: always return VEB stat strings
The ethtool API for obtaining device statistics is not intended to allow
runtime changes in the number of statistics reported. It may *appear*
this way, as there is an ability to request the number of stats using
ethtool_get_set_count(). However, it is expected that this must always
return the same value for invocations of the same device.

If we don't satisfy this contract, and allow the number of stats to
change during run time, we could cause invalid memory accesses or report
the stat strings incorrectly. This is because the API for obtaining
stats is to (1) get the size, (2) get the strings and finally (3) get
the stats. Since these are each separate ethtool op commands, it is not
possible to maintain consistency by holding the RTNL lock over the whole
operation. This results in the potential for a race condition to occur
where the size changed between any of the 3 calls.

Avoid this issue by requiring that we always return the same value for
a given device. We can check any values which remain constant for the
life of the device, but must not report different sizes depending on
runtime attributes.

This patch specifically fixes the VEB statistics strings to always be
reported. Other issues will be fixed in future patches.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22 08:37:06 -07:00
Jacob Keller bdf2752305 i40e: free skb after clearing lock in ptp_stop
Use the same logic to free the skb after clearing the Tx timestamp bit
lock in i40e_ptp_stop as we use in the other locations. It is not as
important here since we are not racing against a future Tx timestamp
request (as we are disabling PTP at this point). However it is good to
be consistent in how we approach the bit lock so that future callers
don't copy the old anti-pattern.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22 08:37:06 -07:00
Florian Fainelli c79c385044 ti: ethernet: davinci: Fix cast to int warnings
Now that we can compile test this driver on 64-bit hosts, we get some
warnings about how a pointer/address is written/read to/from a register
(sw_token). Fix this by doing the appropriate conversions, we cannot
possibly have the driver work on 64-bit hosts the way the tokens are
managed though, since the registers being written to a 32-bit only.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-21 16:17:10 -04:00
Florian Fainelli 5a04e8f81a net: ethernet: davinci_emac: Fix printing of base address
Use %pa which is the correct formatter to print a physical address,
instead of %p which is just a pointer.

Fixes: a6286ee630 ("net: Add TI DaVinci EMAC driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-21 16:17:10 -04:00
Florian Fainelli bf2ce3fdf3 net: ethernet: ti: cpsw: Fix cpsw_add_ch_strings() printk format
When building on a 64-bit host we will get the following warning:

drivers/net/ethernet/ti/cpsw.c: In function 'cpsw_add_ch_strings':
drivers/net/ethernet/ti/cpsw.c:1284:19: warning: format '%d' expects
argument of type 'int', but argument 5 has type 'long unsigned int'
[-Wformat=]
     "%s DMA chan %d: %s", rx_dir ? "Rx" : "Tx",
                  ~^
                  %ld

Fix this by using an %ld format and casting to long.

Fixes: e05107e6b7 ("net: ethernet: ti: cpsw: add multi queue support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-21 16:17:10 -04:00
Florian Fainelli ea5ec9fc9e net: ethernet: ti: cpts: Fix timestamp print
On 64-bit hosts we will get the following warning:

drivers/net/ethernet/ti/cpts.c: In function 'cpts_overflow_check':
drivers/net/ethernet/ti/cpts.c:297:11: warning: format '%lld' expects
argument of type 'long long int', but argument 3 has type
'__kernel_time_t {aka long int}' [-Wformat=]
  pr_debug("cpts overflow check at %lld.%09lu\n", ts.tv_sec,
ts.tv_nsec);

Fix this by using an appropriate casting that works on all bit sizes.

Fixes: a5c79c26e1 ("ptp: cpts: convert to the 64 bit get/set time methods.")
Fixes: 87c0e764d4 ("cpts: introduce time stamping code and a PTP hardware clock.")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-21 16:17:10 -04:00
Florian Fainelli b4eb739368 ti: ethernet: cpdma: Use correct format for genpool_*
Now that we can compile davinci_cpdma.c on 64-bit hosts, we can see that
the format used for printing a size_t type is incorrect, use %zd
accordingly.

Fixes: aeec302104 ("net: ethernet: ti: cpdma: remove used_desc counter")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-21 16:17:09 -04:00
David S. Miller 6f6e434aa2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
S390 bpf_jit.S is removed in net-next and had changes in 'net',
since that code isn't used any more take the removal.

TLS data structures split the TX and RX components in 'net-next',
put the new struct members from the bug fix in 'net' into the RX
part.

The 'net-next' tree had some reworking of how the ERSPAN code works in
the GRE tunneling code, overlapping with a one-line headroom
calculation fix in 'net'.

Overlapping changes in __sock_map_ctx_update_elem(), keep the bits
that read the prog members via READ_ONCE() into local variables
before using them.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-21 16:01:54 -04:00
Ganesh Goudar a6076fcd18 cxgb4: copy the length of cpl_tx_pkt_core to fw_wr
immdlen field of FW_ETH_TX_PKT_WR is filled in a wrong way,
we must copy the length of all the cpls encapsulated in fw
work request. In the xmit path we missed adding the length
of CPL_TX_PKT_CORE but we added the length of WR_HDR and it
worked because WR_HDR and CPL_TX_PKT_CORE are of same length.
Add the length of cpl_tx_pkt_core not WR_HDR's. This also
fixes the lso cpl errors for udp tunnels

Fixes: d0a1299c6b ("cxgb4: add support for vxlan segmentation offload")
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-21 12:16:23 -04:00
Florian Fainelli 6c541b4595 net: ethernet: Sort Kconfig sourcing alphabetically
A number of entries were not alphabetically sorted, remedy that.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-21 12:14:47 -04:00
Sergei Shtylyov 6b14787ab1 sh_eth: fix typo in comment to BCULR write
Simon has noticed a typo in the comment accompaining the BCULR write --
fix it and move the comment before the write (following the style of
the other comments), while at it...

Reported-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:56:43 -04:00
Sergei Shtylyov 9c59c9a8e9 sh_eth: fix comment grammar in 'struct sh_eth_cpu_data'
All the verbs in the comments to the 'struct sh_eth_cpu_data' declaration
should  be in a  3rd person singular, to match the nouns.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:56:42 -04:00
Sergei Shtylyov 27164491cb sh_eth: fix typo in EESR.TRO bit name
The  correct name of the EESR bit 8 is TRO (transmit retry over), not RTO.
Note that EESIPR bit 8, TROIP remained correct...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:56:42 -04:00
Yunsheng Lin eddf04626d net: hns3: Fix for CMDQ and Misc. interrupt init order problem
When vf module is loading, the cmd queue initialization should
happen before misc interrupt initialization, otherwise the misc
interrupt handle will cause using uninitialized cmd queue problem.
There is also the same issue when vf module is unloading.

This patch fixes it by adjusting the location of some function.

Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:59 -04:00
Xi Wang 7bb572fddd net: hns3: Fixes kernel panic issue during rmmod hns3 driver
If CONFIG_ARM_SMMU_V3 is enabled, arm64's dma_ops will replace
arm64_swiotlb_dma_ops with iommu_dma_ops. When releasing contiguous
dma memory, the new ops will call the vunmap function which cannot
be run in interrupt context.

Currently, spin_lock_bh is called before vunmap is executed. This
disables BH and causes the interrupt context to be detected to
generate a kernel panic like below:

[ 2831.573400] kernel BUG at mm/vmalloc.c:1621!
[ 2831.577659] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
...
[ 2831.699907] Process rmmod (pid: 1893, stack limit = 0x0000000055103ee2)
[ 2831.706507] Call trace:
[ 2831.708941]  vunmap+0x48/0x50
[ 2831.711897]  dma_common_free_remap+0x78/0x88
[ 2831.716155]  __iommu_free_attrs+0xa8/0x1c0
[ 2831.720255]  hclge_free_cmd_desc+0xc8/0x118 [hclge]
[ 2831.725128]  hclge_destroy_cmd_queue+0x34/0x68 [hclge]
[ 2831.730261]  hclge_uninit_ae_dev+0x90/0x100 [hclge]
[ 2831.735127]  hnae3_unregister_ae_dev+0xb0/0x868 [hnae3]
[ 2831.740345]  hns3_remove+0x3c/0x90 [hns3]
[ 2831.744344]  pci_device_remove+0x48/0x108
[ 2831.748342]  device_release_driver_internal+0x164/0x200
[ 2831.753553]  driver_detach+0x4c/0x88
[ 2831.757116]  bus_remove_driver+0x60/0xc0
[ 2831.761026]  driver_unregister+0x34/0x60
[ 2831.764935]  pci_unregister_driver+0x30/0xb0
[ 2831.769197]  hns3_exit_module+0x10/0x978 [hns3]
[ 2831.773715]  SyS_delete_module+0x1f8/0x248
[ 2831.777799]  el0_svc_naked+0x30/0x34

This patch fixes it by using spin_lock instead of spin_lock_bh.

Fixes: 68c0a5c706 ("net: hns3: Add HNS3 IMP(Integrated Mgmt Proc) Cmd Interface Support")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:59 -04:00
Fuyun Liang f30dfddcca net: hns3: Fix for netdev not running problem after calling net_stop and net_open
The link status update function is called by timer every second. But
net_stop and net_open may be called with very short intervals. The link
status update function can not detect the link state has changed. It
causes the netdev not running problem.

This patch fixes it by updating the link state in ae_stop function.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:59 -04:00
Huazhong Tan 42687543ca net: hns3: Use enums instead of magic number in hclge_is_special_opcode
This patch does bit of a clean-up by using already defined enums for
certain values in function hclge_is_special_opcode(). Below enums from
have been used as replacements for magic values:

enum hclge_opcode_type{
	<snip>
	HCLGE_OPC_STATS_64_BIT		= 0x0030,
	HCLGE_OPC_STATS_32_BIT		= 0x0031,
	HCLGE_OPC_STATS_MAC		= 0x0032,
	<snip>
};

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:59 -04:00
Xi Wang 3c7624d8fc net: hns3: Fix for hns3 module is loaded multiple times problem
If the hns3 driver has been built into kernel and then loaded with
the same driver which built as KLM, it may trigger an error like
below:

[   20.009555] hns3: Hisilicon Ethernet Network Driver for Hip08 Family - version
[   20.016789] hns3: Copyright (c) 2017 Huawei Corporation.
[   20.022100] Error: Driver 'hns3' is already registered, aborting...
[   23.517397] Unable to handle kernel NULL pointer dereference at virtual address 00000000
...
[   23.691583] Process insmod (pid: 1982, stack limit = 0x00000000cd5f21cb)
[   23.698270] Call trace:
[   23.700705]  __list_del_entry_valid+0x2c/0xd8
[   23.705049]  hnae3_unregister_client+0x68/0xa8
[   23.709487]  hns3_init_module+0x98/0x1000 [hns3]
[   23.714093]  do_one_initcall+0x5c/0x170
[   23.717918]  do_init_module+0x64/0x1f4
[   23.721654]  load_module+0x1d14/0x24b0
[   23.725390]  SyS_init_module+0x158/0x208
[   23.729300]  el0_svc_naked+0x30/0x34

This patch fixes it by adding module version info.

Fixes: 38caee9d3e ("net: hns3: Add support of the HNAE3 framework")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:59 -04:00
Xi Wang 13562d1f5e net: hns3: Fix the missing client list node initialization
This patch fixes the missing initialization of the client list node
in the hnae3_register_client() function.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:59 -04:00
Jian Shen 99a6993a69 net: hns3: cleanup of return values in hclge_init_client_instance()
Removes the goto and directly returns in case of errors as part of the
cleanup.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:59 -04:00
Peng Li e63cd65f43 net: hns3: Fixes API to fetch ethernet header length with kernel default
During the RX leg driver needs to fetch the ethernet header
length from the RX'ed Buffer Descriptor. Currently, proprietary
version hns3_nic_get_headlen is being used to fetch the header
length which uses l234info present in the Buffer Descriptor
which might not be valid for the first Buffer Descriptor if the
packet is spanning across multiple descriptors.
Kernel default eth_get_headlen API does the job correctly.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Peng Li <lipeng321@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:58 -04:00
Salil Mehta 743e1a8438 net: hns3: Fixes error reported by Kbuild and internal review
This patch fixes the error reported by Intel's kbuild and fixes a
return value in one of the legs, caught during review of the original
patch sent by kbuild.

Fixes: fdb793670a00 ("net: hns3: Add support of .sriov_configure in HNS3 driver")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:58 -04:00
Heiner Kallweit 52f8560ee4 r8169: fix network error on resume from suspend
This commit removed calls to rtl_set_rx_mode(). This is ok for the
standard path if the link is brought up, however it breaks system
resume from suspend. Link comes up but no network traffic.

Meanwhile common code from rtl_hw_start_8169/8101/8168() was moved
to rtl_hw_start(), therefore re-add the call to rtl_set_rx_mode()
there.

Due to adding this call we have to move definition of rtl_hw_start()
after definition of rtl_set_rx_mode().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Fixes: 82d3ff6dd1 ("r8169: remove calls to rtl_set_rx_mode")
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:38:47 -04:00
Jeff Kirsher f0b99e3ab3 Revert "ixgbe: release lock for the duration of ixgbe_suspend_close()"
This reverts commit 6710f970d9.

Gotta love when developers have offline discussions, thinking everyone
is reading their responses/dialog.

The change had the potential for a number of race conditions on
shutdown, which is why we are reverting the change.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:23:07 -04:00
Hemanth Puranik 1bc49fd18b net: qcom/emac: Allocate buffers from local node
Currently we use non-NUMA aware allocation for TPD and RRD buffers,
this patch modifies to use NUMA friendly allocation.

Signed-off-by: Hemanth Puranik <hpuranik@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:22:35 -04:00
Sergei Shtylyov 3eb9c2ad1d sh_eth: add R8A77980 support
Finally, add support for the DT probing of the R-Car V3H (AKA R8A77980) --
it's the only R-Car gen3 SoC having the GEther controller -- others have
only EtherAVB...

Based on the original (and large) patch by Vladimir Barinov.

Signed-off-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 23:24:46 -04:00
Sergei Shtylyov 93f0fa7519 sh_eth: add EDMR.NBST support
The R-Car V3H (AKA R8A77980) GEther controller adds the DMA burst mode bit
(NBST) in EDMR and the manual tells to always set it before doing any DMA.

Based on the original (and large) patch by Vladimir Barinov.

Signed-off-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 23:24:46 -04:00
Sergei Shtylyov 230c184679 sh_eth: add RGMII support
The R-Car V3H (AKA R8A77980) GEther controller  adds support for the RGMII
PHY interface mode as a new  value  for the RMII_MII register.

Based on the original (and large) patch by Vladimir Barinov.

Signed-off-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 23:24:46 -04:00
Maxime Chevallier 62c8a069b5 net: mvpp2: Add missing VLAN tag detection
Marvell PPv2 Header Parser sets some bits in the 'result_info' field in
each lookup iteration, to identify different packet attributes such as
DSA / VLAN tag, protocol infos, etc. This is used in further
classification stages in the controller.

It's the DSA tag detection entry that is in charge of detecting when there
is a single VLAN tag.

This commits adds the missing update of the result_info in this case.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 22:55:32 -04:00
Jiri Pirko ec932fbda7 mlxsw: use devlink helper to generate physical port name
Since devlink knows the info needed to generate the physical port name
in a generic way for all devlink users, use the helper to do the job.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 16:30:39 -04:00
Jiri Pirko 5ec1380a21 devlink: extend attrs_set for setting port flavours
Devlink ports can have specific flavour according to the purpose of use.
This patch extend attrs_set so the driver can say which flavour port
has. Initial flavours are:
physical, cpu, dsa
User can query this to see right away what is the purpose of each port.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 16:30:39 -04:00
Jiri Pirko b9ffcbaf56 devlink: introduce devlink_port_attrs_set
Change existing setter for split port information into more generic
attrs setter. Alongside with that, allow to set port number and subport
number for split ports.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 16:30:39 -04:00
Jiong Wang c217abccaa nfp: bpf: support arithmetic indirect right shift (BPF_ARSH | BPF_X)
Code logic is similar with arithmetic right shift by constant, and NFP
get indirect shift amount through source A operand of PREV_ALU.

It is possible to fall back to logic right shift if the MSB is known to be
zero from range info, however there is no benefit to do this given logic
indirect right shift use the same number and cycle of instruction sequence.

Suppose the MSB of regX is the bit we want to replicate to fill in all the
vacant positions, and regY contains the shift amount, then we could use
single instruction to set up both.

  [alu, --, regY, OR, regX]

  --
  NOTE: the PREV_ALU result doesn't need to write to any destination
        register.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-18 21:35:55 +02:00
Jiong Wang f43d0f17fe nfp: bpf: support arithmetic right shift by constant (BPF_ARSH | BPF_K)
Code logic is similar with logic right shift except we also need to set
PREV_ALU result properly, the MSB of which is the bit that will be
replicated to fill in all the vacant positions.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-18 21:35:55 +02:00
Jiong Wang 991f5b3651 nfp: bpf: support logic indirect shifts (BPF_[L|R]SH | BPF_X)
For indirect shifts, shift amount is not specified as constant, NFP needs
to get the shift amount through the low 5 bits of source A operand in
PREV_ALU, therefore extra instructions are needed compared with shifts by
constants.

Because NFP is 32-bit, so we are using register pair for 64-bit shifts and
therefore would need different instruction sequences depending on whether
shift amount is less than 32 or not.

NFP branch-on-bit-test instruction emitter is added by this patch and is
used for efficient runtime check on shift amount. We'd think the shift
amount is less than 32 if bit 5 is clear and greater or equal than 32
otherwise. Shift amount is greater than or equal to 64 will result in
undefined behavior.

This patch also use range info to avoid generating unnecessary runtime code
if we are certain shift amount is less than 32 or not.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-18 21:35:54 +02:00
Jose Abreu eb38401c77 net: stmmac: Populate missing callbacks in HWIF initialization
Some HW specific setups, like sun8i, do not populate all the necessary
callbacks, which is what HWIF helpers were expecting.

Fix this by always trying to get the generic helpers and populate them
if they were not previously populated by HW specific setup.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Fixes: 5f0456b431 ("net: stmmac: Implement logic to automatically
select HW Interface")
Reported-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Cc: Corentin Labbe <clabbe.montjoie@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:56:08 -04:00
Rahul Lakkireddy d775f26b29 cxgb4: fix offset in collecting TX rate limit info
Correct the indirect register offsets in collecting TX rate limit info
in UP CIM logs.

Also, T5 doesn't support these indirect register offsets, so remove
them from collection logic.

Fixes: be6e36d916 ("cxgb4: collect TX rate limit info in UP CIM logs")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:54:48 -04:00
Rahul Lakkireddy 80a95a80d3 cxgb4: collect SGE PF/VF queue map
For T6, collect info on queue mapping to corresponding PF/VF in SGE.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:54:10 -04:00
Antoine Tenart a3302baa2c net: mvpp2: typo and cosmetic fixes
This patch on the Marvell PPv2 driver is only cosmetic. Two typos are
removed as well as other cosmetic fixes, such as extra new lines or tabs
vs spaces.

Suggested-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:48:08 -04:00
Geert Uytterhoeven b16a960ddb sh_eth: Change platform check to CONFIG_ARCH_RENESAS
Since commit 9b5ba0df4e ("ARM: shmobile: Introduce ARCH_RENESAS")
is CONFIG_ARCH_RENESAS a more appropriate platform check than the legacy
CONFIG_ARCH_SHMOBILE, hence use the former.

Renesas SuperH SH-Mobile SoCs are still covered by the CONFIG_CPU_SH4
check.

This will allow to drop ARCH_SHMOBILE on ARM and ARM64 in the near
future.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:45:48 -04:00
David S. Miller d6830519a9 mlx5e-updates-2018-05-17
From: Or Gerlitz <ogerlitz@mellanox.com>
 
 This series addresses a regression introduced by the
 shared block TC changes [1]. Currently, for VF->VF and uplink->VF rules, the
 TC core (cls_api) attempts to offload the same flow multiple times into
 the driver, as a side effect of the mlx5 registration to the egdev callback.
 
 We use the flow cookie to ignore attempts to add such flows, we can't
 reject them (return error), b/c this will fail the offload attempt, so we
 ignore that.
 
 The last patch of the series deals with exposing HW stats counters through
 ethtool for the vport reps.
 
 Dave - the regression that we are addressing was introduced in 4.15 [1] and applies
 to nfp and mlx5. Jiri suggested to push driver side fixes to net-next, this is
 already done for nfp [2][3]. Once this is upstream, we will submit a small/point
 single patch fix for the TC core code which can serve for net and stable, but not
 carried into net-next, b/c it might limit some future use-cases.
 
 [1] 208c0f4b52 "net: sched: use tc_setup_cb_call to call per-block callbacks"
 [2] c50647d "nfp: flower: ignore duplicate cb requests for same rule"
 [3] 54a4a03 "nfp: flower: support offloading multiple rules with same cookie"
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJa/iMHAAoJEEg/ir3gV/o+6YcIAMIcUmH0Yga2CQLl1VGZr4v7
 5Yo5z8upZ2pKVlBNtgeDonIckcFNbtPaUC7xTolFmOks6DgoGKKLvrIyq3tNG+42
 ShF91BmLvrpl/+8GmsjNf5qvsmc6piOHfBknlaIl7XeoaKLMfy4ts7/Cryt0U24k
 meE/zu7slOOam6H2RyXKLsJa0uP/SpxrCq1OAlAwhmVe60p9SRCVkutmRqU47OuO
 Oc3XGtYbU3IVa3B0bdi+SdOyF1RykCH3PKSrChy2WhdfpSp29I+gydfWMX8/3+z4
 mH3/LDi4CAoHiCqnUr3s5h6zGuYqwcpSYY3tUPUD3A48LnKjT70LJF85F/v7exY=
 =QeXI
 -----END PGP SIGNATURE-----

Merge tag 'mlx5e-updates-2018-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-updates-2018-05-17

From: Or Gerlitz <ogerlitz@mellanox.com>

This series addresses a regression introduced by the
shared block TC changes [1]. Currently, for VF->VF and uplink->VF rules, the
TC core (cls_api) attempts to offload the same flow multiple times into
the driver, as a side effect of the mlx5 registration to the egdev callback.

We use the flow cookie to ignore attempts to add such flows, we can't
reject them (return error), b/c this will fail the offload attempt, so we
ignore that.

The last patch of the series deals with exposing HW stats counters through
ethtool for the vport reps.

Dave - the regression that we are addressing was introduced in 4.15 [1] and applies
to nfp and mlx5. Jiri suggested to push driver side fixes to net-next, this is
already done for nfp [2][3]. Once this is upstream, we will submit a small/point
single patch fix for the TC core code which can serve for net and stable, but not
carried into net-next, b/c it might limit some future use-cases.

[1] 208c0f4b52 "net: sched: use tc_setup_cb_call to call per-block callbacks"
[2] c50647d "nfp: flower: ignore duplicate cb requests for same rule"
[3] 54a4a03 "nfp: flower: support offloading multiple rules with same cookie"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:00:43 -04:00
David S. Miller 3888ea4e2f mlx5-updates-2018-05-17
mlx5 core dirver updates for both net-next and rdma-next branches.
 
 From Christophe JAILLET, first three patche to use kvfree where needed.
 
 From: Or Gerlitz <ogerlitz@mellanox.com>
 
 Next six patches from Roi and Co adds support for merged
 sriov e-switch which comes to serve cases where both PFs, VFs set
 on them and both uplinks are to be used in single v-switch SW model.
 When merged e-switch is supported, the per-port e-switch is logically
 merged into one e-switch that spans both physical ports and all the VFs.
 
 This model allows to offload TC eswitch rules between VFs belonging
 to different PFs (and hence have different eswitch affinity), it also
 sets the some of the foundations needed for uplink LAG support.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJa/fLEAAoJEEg/ir3gV/o+7jUH/3n5/Uw1LLt3TfeKArx6i0F1
 3G4U5B0ha03qiDqXprwhyQ3I6lgYmRBmjcxnqmvcqOAqO4/hSsjtTR+A/mgbEDhJ
 YtdekFNEX+72h/N2GIpZwChIWSE3EcMPaLYnV8TwLUgh9YSust2sCLSBbJCjxOKc
 j78M8ept/bXZwTm/iJhEjtmqw0xl91rl011chCAua0iEpH3wxteDARmKABFHMQxl
 I3N/x/e/astgcSCNgpO4uDf9zEIRkNdzcHPzSMJ6C2Oo5W9XiZEekfw7WKj9nXfa
 G+eGckkAyCOQ/r2lZ9nA0ZUvQ2X6JISvxgohuaCNwTgsz3acTxbLnQK4YWHzQCQ=
 =iHi6
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2018-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Saeed Mahameed says:

====================
mlx5-updates-2018-05-17

mlx5 core dirver updates for both net-next and rdma-next branches.

From Christophe JAILLET, first three patche to use kvfree where needed.

From: Or Gerlitz <ogerlitz@mellanox.com>

Next six patches from Roi and Co adds support for merged
sriov e-switch which comes to serve cases where both PFs, VFs set
on them and both uplinks are to be used in single v-switch SW model.
When merged e-switch is supported, the per-port e-switch is logically
merged into one e-switch that spans both physical ports and all the VFs.

This model allows to offload TC eswitch rules between VFs belonging
to different PFs (and hence have different eswitch affinity), it also
sets the some of the foundations needed for uplink LAG support.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:00:08 -04:00
Alexandre Belloni 64a2658b58 net: mscc: Add SPDX identifier
ocelot_qsys.h is missing the SPDX identfier, fix that.

Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Reviewed-by: Allan W. Nielsen <allan.nielsen@microsemi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:30:25 -04:00
Jose Abreu 61fac60a6a net: stmmac: Remove if condition by taking advantage of hwif return code
We can remove the if condition and check if return code is different
than -EINVAL, meaning callback is present.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:00:16 -04:00
Jose Abreu d2df9ea0ad net: stmmac: Let descriptor code get skbuff address
Stop using if conditions depending on the GMAC version for getting the
descriptor skbuff address and use instead a helper implemented in the
descriptor files.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:00:16 -04:00
Jose Abreu 357951cdf0 net: stmmac: Uniformize set_rx_owner()
Currently an if condition is used to select the correct callback to set
rx_onwer in descriptor. Lets keep this simple and always use the same
callback.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:00:16 -04:00
Jose Abreu f1565c6021 net: stmmac: Remove uneeded check for GMAC version in stmmac_xmit
We either have .enable_dma_transmission or .set_tx_tail_ptr in the HW
table callbacks, we can never have both so there is no need to check for
GMAC version.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:00:15 -04:00
Jose Abreu 24aaed0cc0 net: stmmac: Uniformize the use of dma_init_* callbacks
Instead of relying on the GMAC version for choosing if we need to use
dma_init or dma_init_{rx/tx}_chan callback, lets uniformize this and
always use the dma_init_{rx/tx}_chan callbacks.

While at it, fix the use of dma_init_chan callback, which shall be
called for as many channels as the max of rx/tx channels.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:00:15 -04:00
Jose Abreu 758d5c73e2 net: stmmac: Move PTP and MMC base address calculation to hwif.c
PTP and MMC modules base address can depend on the GMAC version. As this
is HW specific lets move this base address calculation to hwif.c. Also,
add an entry in the HW table so that we can specify the module offset.
This can later be extended to more modules, if deemed necessary.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:00:15 -04:00
Jose Abreu 63a550fc15 net: stmmac: Remove uneeded checks for GMAC version
With the introducion of callbacks check in hwif.h we only call the
callback if HW supports it so there is no longer need to check for GMAC
version.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:00:15 -04:00
Jose Abreu ab0204e35c net: stmmac: Uniformize the use of dma_{rx/tx}_mode callbacks
Instead of relying on the GMAC version for choosing if we need to use
dma_{rx/tx}_mode or just dma_mode callback lets uniformize this and
always use the dma_{rx/tx}_mode callbacks.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:00:15 -04:00
Jose Abreu 44c67f8559 net: stmmac: Let descriptor code clear the descriptor
Stop using if conditions depending on the GMAC version for clearing the
descriptor and use instead a helper implemented in the descriptor files.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:00:15 -04:00
Jose Abreu 6844171d5b net: stmmac: Let descriptor code set skbuff address
Stop using if conditions depending on the GMAC version for setting the
the descriptor skbuff address and use instead a helper implemented in
the descriptor files.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:00:14 -04:00
Jose Abreu 4ae0169fd1 net: stmmac: Do not keep rearming the coalesce timer in stmmac_xmit
This is cutting down performance. Once the timer is armed it should run
after the time expires for the first packet sent and not the last one.

After this change, running iperf, the performance gain is +/- 24%.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:00:14 -04:00
Jose Abreu 67e1c4068d net: stmmac: Enable OSP for GMAC4
This enables OSP (Operate on Second Packet) for GMAC4. The feature
allows DMA to fetch second descriptor while its still processing the
first one.

Running iperf, the performance gain is +/- 38%.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:00:14 -04:00
Or Gerlitz a228060a7c net/mlx5e: Add HW vport counters to representor ethtool stats
Currently the representor only report the SW (slow-path) traffic
counters.

Add packet/bytes reporting of the HW counters, which account for the
total amount of traffic that was handled by the vport, both slow and
fast (offloaded) paths. The newly exposed counters are named
vport_rx/tx_packets/bytes.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Adi Nissim <adin@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 17:48:54 -07:00
Or Gerlitz 8f8ae8953f net/mlx5e: Ignore attempts to offload multiple times a TC flow
For VF->VF and uplink->VF rules, the TC core (cls_api) attempts
to offload the same flow multiple times into the driver, b/c we
registered to the egdev callback.

Use the flow cookie to ignore attempts to add such flows, we can't
reject them (return error), b/c this will fail the offload attempt,
so we ignore that. We indentify wrong stat/del calls using the flow
ingress/egress flags, here we do return error to the core.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 17:48:54 -07:00
Or Gerlitz 655dc3d2b9 net/mlx5e: Use shared table for offloaded TC eswitch flows
Currently, each representor netdev use their own hash table to keep
the mapping from TC flow (f->cookie) to the driver offloaded instance.
The table is the one which originally was added for offloading TC NIC
(not eswitch) rules.

This scheme breaks when the core TC code calls us to add the same flow
twice, (e.g under egdev use case) since we don't spot that and offload
a 2nd flow into the HW with the wrong source vport.

As a pre-step to solve that, we move to use a single table which keeps
all offloaded TC eswitch flows. The table is located at the eswitch
uplink representor object.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 17:48:54 -07:00
Or Gerlitz 05866c8236 net/mlx5e: Prepare for shared table to keep TC eswitch flows
This is a refactoring step to be able and store the hash table which
keeps track of offloaded TC flows in a different location for NIC
vs e-switch rules.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 17:48:54 -07:00
Or Gerlitz 60bd4af814 net/mlx5e: Add ingress/egress indication for offloaded TC flows
When an e-switch TC rule is offloaded through the egdev (egress
device) mechanism, we treat this as egress, all other cases (NIC
and e-switch) are considred ingress.

This is preparation step that will allow us to  identify "wrong"
stat/del offload calls made by the TC core on egdev based flows and
ignore them.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 17:48:54 -07:00
Rabie Loulou b1d90e6bbd net/mlx5e: Offload TC eswitch rules for VFs belonging to different PFs
When the merged eswitch capability is supported, allow offloading rules
between VFs which belong to different PFs (and hence have different
eswitch affinity).

Signed-off-by: Rabie Loulou <rabiel@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 17:48:54 -07:00
Saeed Mahameed 260ab7042e mlx5-updates-2018-05-17
mlx5 core dirver updates for both net-next and rdma-next branches.
 
 From Christophe JAILLET, first three patche to use kvfree where needed.
 
 From: Or Gerlitz <ogerlitz@mellanox.com>
 
 Next six patches from Roi and Co adds support for merged
 sriov e-switch which comes to serve cases where both PFs, VFs set
 on them and both uplinks are to be used in single v-switch SW model.
 When merged e-switch is supported, the per-port e-switch is logically
 merged into one e-switch that spans both physical ports and all the VFs.
 
 This model allows to offload TC eswitch rules between VFs belonging
 to different PFs (and hence have different eswitch affinity), it also
 sets the some of the foundations needed for uplink LAG support.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJa/fLEAAoJEEg/ir3gV/o+7jUH/3n5/Uw1LLt3TfeKArx6i0F1
 3G4U5B0ha03qiDqXprwhyQ3I6lgYmRBmjcxnqmvcqOAqO4/hSsjtTR+A/mgbEDhJ
 YtdekFNEX+72h/N2GIpZwChIWSE3EcMPaLYnV8TwLUgh9YSust2sCLSBbJCjxOKc
 j78M8ept/bXZwTm/iJhEjtmqw0xl91rl011chCAua0iEpH3wxteDARmKABFHMQxl
 I3N/x/e/astgcSCNgpO4uDf9zEIRkNdzcHPzSMJ6C2Oo5W9XiZEekfw7WKj9nXfa
 G+eGckkAyCOQ/r2lZ9nA0ZUvQ2X6JISvxgohuaCNwTgsz3acTxbLnQK4YWHzQCQ=
 =iHi6
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2018-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

mlx5-updates-2018-05-17

mlx5 core dirver updates for both net-next and rdma-next branches.

From Christophe JAILLET, first three patches to use kvfree where needed.

From: Or Gerlitz <ogerlitz@mellanox.com>

Next six patches from Roi and Co adds support for merged
sriov e-switch which comes to serve cases where both PFs, VFs set
on them and both uplinks are to be used in single v-switch SW model.
When merged e-switch is supported, the per-port e-switch is logically
merged into one e-switch that spans both physical ports and all the VFs.

This model allows to offload TC eswitch rules between VFs belonging
to different PFs (and hence have different eswitch affinity), it also
sets the some of the foundations needed for uplink LAG support.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 17:47:55 -07:00
Shahar Klein 10ff5359f8 net/mlx5e: Explicitly set source e-switch in offloaded TC rules
Set a specific source e-switch when setting a rule that matches on the
ingress port.

Signed-off-by: Shahar Klein <shahark@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 14:17:35 -07:00
Shahar Klein 3e99df8772 net/mlx5: Add source e-switch owner
The source e-switch owner allows a vport on one e-switch port be associated
with a rule defined on the second port e-switch.

The role of the source eswitch owner valid bit in the flow group is to
allow the firmware fail driver attempts to wild card the source eswitch
match field. If this bit is not set, the firmware ignores the source
eswitch owner field totally.

Signed-off-by: Shahar Klein <shahark@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 14:17:34 -07:00
Rabie Loulou 56e858df9f net/mlx5e: Explicitly set destination e-switch in FDB rules
Set a specific destination e-switch when setting a destination vport.

Signed-off-by: Rabie Loulou <rabiel@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 14:17:34 -07:00
Shahar Klein b17f7fc10f net/mlx5: Add destination e-switch owner
The destination e-switch owner allows a rule in namespace of one e-switch
owner to point to a vport that is natively associated with another
e-switch owner.

Signed-off-by: Shahar Klein <shahark@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 14:17:34 -07:00
Shahar Klein 65360e5451 net/mlx5: Properly handle a vport destination when setting FTE
When creating FTE, properly distinguish between destination being vport
or tir. The previous code just worked accidentally b/c of both dest being
in the same offset within a union.

Signed-off-by: Shahar Klein <shahark@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 14:17:34 -07:00
Florian Fainelli 78cc6e7ef9 net: ethernet: freescale: Allow FEC with COMPILE_TEST
The Freescale FEC driver builds fine with COMPILE_TEST, so make that
possible.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 17:11:06 -04:00
Florian Fainelli 2652113ff0 net: ethernet: ti: Allow most drivers with COMPILE_TEST
Most of the TI drivers build just fine with COMPILE_TEST, cpmac (AR7) is
the exception because it uses a header file from
arch/mips/include/asm/mach-ar7/ar7.h and keystone netcp which requires
help from drivers/soc/ti/ for queue management helpers.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 17:11:06 -04:00
Manish Chopra 8a8633978b qede: Add build_skb() support.
This patch makes use of build_skb() throughout in driver's receieve
data path [HW gro flow and non HW gro flow]. With this, driver can
build skb directly from the page segments which are already mapped
to the hardware instead of allocating new SKB via netdev_alloc_skb()
and memcpy the data which is quite costly.

This really improves performance (keeping same or slight gain in rx
throughput) in terms of CPU utilization which is significantly reduced
[almost half] in non HW gro flow where for every incoming MTU sized
packet driver had to allocate skb, memcpy headers etc. Additionally
in that flow, it also gets rid of bunch of additional overheads
[eth_get_headlen() etc.] to split headers and data in the skb.

Tested with:
system: 2 sockets, 4 cores per socket, hyperthreading, 2x4x2=16 cores
iperf [server]: iperf -s
iperf [client]: iperf -c <server_ip> -t 500 -i 10 -P 32

HW GRO off – w/o build_skb(), throughput: 36.8 Gbits/sec

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
Average:     all    0.59    0.00   32.93    0.00    0.00   43.07    0.00    0.00   23.42

HW GRO off - with build_skb(), throughput: 36.9 Gbits/sec

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
Average:     all    0.70    0.00   31.70    0.00    0.00   25.68    0.00    0.00   41.92

HW GRO on - w/o build_skb(), throughput: 36.9 Gbits/sec

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
Average:     all    0.86    0.00   24.14    0.00    0.00    6.59    0.00    0.00   68.41

HW GRO on - with build_skb(), throughput: 37.5 Gbits/sec

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
Average:     all    0.87    0.00   23.75    0.00    0.00    6.19    0.00    0.00   69.19

Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 17:06:53 -04:00
David S. Miller 56a9a9e737 Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
10GbE Intel Wired LAN Driver Updates 2018-05-17

This series contains updates to ixgbe, ixgbevf and ice drivers.

Cathy Zhou resolves sparse warnings by using the force attribute.

Mauro S M Rodrigues fixes a bug where IRQs were not freed if a PCI error
recovery system opts to remove the device which causes
ixgbe_io_error_detected() to return PCI_ERS_RESULT_DISCONNECT before
calling ixgbe_close_suspend() which results in IRQs not freed and
crashing when the remove handler calls pci_disable_device().  Resolved
this by calling ixgbe_close_suspend() before evaluating the PCI channel
state.

Pavel Tatashin releases the rtnl_lock during the call to
ixgbe_close_suspend() to allow scaling if device_shutdown() is
multi-threaded.

Emil modifies ixgbe to not validate the MAC address during a reset,
unless the MAC was set on the host so that the VF will get a new MAC
address every time it reloads.  Also updates ixgbevf to set
hw->mac.perm_addr in order to retain the custom MAC on a reset.

Anirudh updates the ice NVM read/erase/update AQ commands to align with
the latest specification.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 17:02:57 -04:00
Jiri Pirko 3b734ff604 nfp: flower: fix error path during representor creation
Don't store repr pointer to reprs array until the representor is
successfully created. This avoids message about "representor
destruction" even when it was never created. Also it cleans-up the flow.
Also, check return value after port alloc.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:23:29 -04:00
Yan Markman 934e0f8330 net: mvpp2: print rx error with rate-limit
Prevent flood of RX error prints during heavy traffic with weak signal
in link by checking net_ratelimit() before using netdev_err().

Signed-off-by: Yan Markman <ymarkman@marvell.com>
[Antoine: small rework, commit message]
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:18:55 -04:00
Yan Markman 5b0ab2f41d net: mvpp2: set mac address does not require the stop/start sequence
Remove special stop/start handling from the set_mac_address callback.
All this special care is not needed, and can be removed. It also
simplifies the up/down status in the driver and helps avoiding possible
link status mismatch issues.

Signed-off-by: Yan Markman <ymarkman@marvell.com>
[Antoine: commit message]
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:18:54 -04:00
Yan Markman 914365f1c9 net: mvpp2: avoid checking for free aggregated descriptors twice
Avoid repeating the check for free aggregated descriptors when it
already failed at the beginning of the function.

Signed-off-by: Yan Markman <ymarkman@marvell.com>
[Antoine: commit message]
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:18:54 -04:00
Antoine Tenart a6fe31de86 net: mvpp2: 2500baseX support
This patch adds the 2500Base-X PHY mode support in the Marvell PPv2
driver. 2500Base-X is quite close to 1000Base-X and SGMII modes and uses
nearly the same code path.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:11:40 -04:00
Antoine Tenart d97c9f4ab0 net: mvpp2: 1000baseX support
This patch adds the 1000Base-X PHY mode support in the Marvell PPv2
driver. 1000Base-X is quite close the SGMII and uses nearly the same
code path.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:11:40 -04:00
Antoine Tenart 4bb0432628 net: mvpp2: phylink support
Convert the PPv2 driver to implement phylink helpers, and use phylink in
DT mode. The other mode supported is ACPI, which will need further work
in order to be entirely compatible with phylink.

The MAC and GoP configuration functions were completely moved to fit
into the phylink helpers. When a PHY is always present between the MAC
and the physical port, phylink only is used, but when this is not the
case (the MAC directly is connected to the physical port) the link IRQ
is used to detect changes in the link state and call phylink_mac_change.

The ACPI mode do not uses phylink as of now, and the changes shouldn't
impact its use.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:11:39 -04:00
Antoine Tenart dcd3e73ae7 net: mvpp2: align the ethtool ops definition
Cosmetic patch to align the ethtool functions to ops definitions. This
patch does not change in any way the driver's behaviour.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:11:39 -04:00
Ivan Khoronzhuk 9611d6d6e2 net: ethernet: ti: cpsw: disable mq feature for "AM33xx ES1.0" devices
The early versions of am33xx devices, related to ES1.0 SoC revision
have errata limiting mq support. That's the same errata as
commit 7da1160002 ("drivers: net: cpsw: add am335x errata workarround for
interrutps")

AM33xx Errata [1] Advisory 1.0.9
http://www.ti.com/lit/er/sprz360f/sprz360f.pdf

After additional investigation were found that drivers w/a is
propagated on all AM33xx SoCs and on DM814x. But the errata exists
only for ES1.0 of AM33xx family, limiting mq support for revisions
after ES1.0. So, disable mq support only for related SoCs and use
separate polls for revisions allowing mq.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 15:11:55 -04:00
Thomas Falcon 0718421389 ibmvnic: Fix statistics buffers memory leak
Move initialization of statistics buffers from ibmvnic_init function
into ibmvnic_probe. In the current state, ibmvnic_init will be called
again during a device reset, resulting in the allocation of new
buffers without freeing the old ones.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 14:57:19 -04:00
Thomas Falcon 134bbe7f21 ibmvnic: Fix non-fatal firmware error reset
It is not necessary to disable interrupt lines here during a reset
to handle a non-fatal firmware error. Move that call within the code
block that handles the other cases that do require interrupts to be
disabled and re-enabled.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 14:57:19 -04:00
Thomas Falcon 4cf2ddf3e3 ibmvnic: Free coherent DMA memory if FW map failed
If the firmware map fails for whatever reason, remember to free
up the memory after.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 14:57:19 -04:00
Anirudh Venkataramanan 43c89b1642 ice: Update NVM AQ command functions
This patch updates the NVM read/erase/update AQ commands to align with
the latest specification.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-17 09:14:09 -07:00
Emil Tantilov 6e7d0ba1e5 ixgbevf: fix MAC address changes through ixgbevf_set_mac()
Set hw->mac.perm_addr in ixgbevf_set_mac() in order to avoid losing the
custom MAC on reset. This can happen in the following case:

>ip link set $vf address $mac
>ethtool -r $vf

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-17 09:07:37 -07:00
Emil Tantilov a8d9bb3d44 ixgbe: force VF to grab new MAC on driver reload
Do not validate the MAC address during a reset, unless the MAC was set on
the host. This way the VF will get a new MAC address every time it reloads.

Remove the "no MAC address assigned" message since it will get spammed on
reset and it doesn't help much as the MAC on the VF is randomly generated.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-17 09:05:49 -07:00
Pavel Tatashin 6710f970d9 ixgbe: release lock for the duration of ixgbe_suspend_close()
Currently, during device_shutdown() ixgbe holds rtnl_lock for the duration
of lengthy ixgbe_close_suspend(). On machines with multiple ixgbe cards
this lock prevents scaling if device_shutdown() function is multi-threaded.

It is not necessary to hold this lock during ixgbe_close_suspend()
as it is not held when ixgbe_close() is called also during shutdown but for
kexec case.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-17 09:04:15 -07:00
Mauro S M Rodrigues b212d815e7 ixgbe/ixgbevf: Free IRQ when PCI error recovery removes the device
Since commit f7f37e7ff2 ("ixgbe: handle close/suspend race with
netif_device_detach/present") ixgbe_close_suspend is called, from
ixgbe_close, only if the device is present, i.e. if it isn't detached.
That exposed a situation where IRQs weren't freed if a PCI error
recovery system opts to remove the device. For such case the pci channel
state is set to pci_channel_io_perm_failure and ixgbe_io_error_detected
was returning PCI_ERS_RESULT_DISCONNECT before calling
ixgbe_close_suspend consequentially not freeing IRQ and crashing when
the remove handler calls pci_disable_device, hitting a BUG_ON at
free_msi_irqs, which asserts that there is no non-free IRQ associated
with the device to be removed:

BUG_ON(irq_has_action(entry->irq + i));

The issue is fixed by calling the ixgbe_close_suspend before evaluate
the pci channel state.

Reported-by: Naresh Bannoth <nbannoth@in.ibm.com>
Reported-by: Abdul Haleem <abdhalee@in.ibm.com>
Signed-off-by: Mauro S M Rodrigues <maurosr@linux.vnet.ibm.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-17 09:00:54 -07:00
Cathy Zhou 9cfbfa701b ixgbe: cleanup sparse warnings
Sparse complains valid conversions between restricted types, force
attribute is used to avoid those warnings.

Signed-off-by: Cathy Zhou <cathy.zhou@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-17 08:24:30 -07:00
Ariel Levkovich 71c6e8638c IB/mlx5: Add support for MPLS flow specification
This patch introduces support for the MPLS flow spec and
allows the creation of rules that are matching on the
MPLS label.

Applying the rule matching depends on the flow specs order and
the location of the MPLS in the spec list as there are different
configurations to be made in the device in the cases of MPLSoGRE
and MPLSoUDP vs. non-encapsulated MPLS.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-16 21:32:55 -06:00
David S. Miller b9f672af14 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-05-17

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Provide a new BPF helper for doing a FIB and neighbor lookup
   in the kernel tables from an XDP or tc BPF program. The helper
   provides a fast-path for forwarding packets. The API supports
   IPv4, IPv6 and MPLS protocols, but currently IPv4 and IPv6 are
   implemented in this initial work, from David (Ahern).

2) Just a tiny diff but huge feature enabled for nfp driver by
   extending the BPF offload beyond a pure host processing offload.
   Offloaded XDP programs are allowed to set the RX queue index and
   thus opening the door for defining a fully programmable RSS/n-tuple
   filter replacement. Once BPF decided on a queue already, the device
   data-path will skip the conventional RSS processing completely,
   from Jakub.

3) The original sockmap implementation was array based similar to
   devmap. However unlike devmap where an ifindex has a 1:1 mapping
   into the map there are use cases with sockets that need to be
   referenced using longer keys. Hence, sockhash map is added reusing
   as much of the sockmap code as possible, from John.

4) Introduce BTF ID. The ID is allocatd through an IDR similar as
   with BPF maps and progs. It also makes BTF accessible to user
   space via BPF_BTF_GET_FD_BY_ID and adds exposure of the BTF data
   through BPF_OBJ_GET_INFO_BY_FD, from Martin.

5) Enable BPF stackmap with build_id also in NMI context. Due to the
   up_read() of current->mm->mmap_sem build_id cannot be parsed.
   This work defers the up_read() via a per-cpu irq_work so that
   at least limited support can be enabled, from Song.

6) Various BPF JIT follow-up cleanups and fixups after the LD_ABS/LD_IND
   JIT conversion as well as implementation of an optimized 32/64 bit
   immediate load in the arm64 JIT that allows to reduce the number of
   emitted instructions; in case of tested real-world programs they
   were shrinking by three percent, from Daniel.

7) Add ifindex parameter to the libbpf loader in order to enable
   BPF offload support. Right now only iproute2 can load offloaded
   BPF and this will also enable libbpf for direct integration into
   other applications, from David (Beckett).

8) Convert the plain text documentation under Documentation/bpf/ into
   RST format since this is the appropriate standard the kernel is
   moving to for all documentation. Also add an overview README.rst,
   from Jesper.

9) Add __printf verification attribute to the bpf_verifier_vlog()
   helper. Though it uses va_list we can still allow gcc to check
   the format string, from Mathieu.

10) Fix a bash reference in the BPF selftest's Makefile. The '|& ...'
    is a bash 4.0+ feature which is not guaranteed to be available
    when calling out to shell, therefore use a more portable variant,
    from Joe.

11) Fix a 64 bit division in xdp_umem_reg() by using div_u64()
    instead of relying on the gcc built-in, from Björn.

12) Fix a sock hashmap kmalloc warning reported by syzbot when an
    overly large key size is used in hashmap then causing overflows
    in htab->elem_size. Reject bogus attr->key_size early in the
    sock_hash_alloc(), from Yonghong.

13) Ensure in BPF selftests when urandom_read is being linked that
    --build-id is always enabled so that test_stacktrace_build_id[_nmi]
    won't be failing, from Alexei.

14) Add bitsperlong.h as well as errno.h uapi headers into the tools
    header infrastructure which point to one of the arch specific
    uapi headers. This was needed in order to fix a build error on
    some systems for the BPF selftests, from Sirio.

15) Allow for short options to be used in the xdp_monitor BPF sample
    code. And also a bpf.h tools uapi header sync in order to fix a
    selftest build failure. Both from Prashant.

16) More formally clarify the meaning of ID in the direct packet access
    section of the BPF documentation, from Wang.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 22:47:11 -04:00
Christophe JAILLET e574978ae5 net/mlx5: Eswitch, Use 'kvfree()' for memory allocated by 'kvzalloc()'
When 'kvzalloc()' is used to allocate memory, 'kvfree()' must be used to
free it.

Fixes: fed9ce22bf ("net/mlx5: E-Switch, Add API to create vport rx rules")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-16 17:49:01 -07:00
Christophe JAILLET a5898e97f0 net/mlx5: Vport, Use 'kvfree()' for memory allocated by 'kvzalloc()'
When 'kvzalloc()' is used to allocate memory, 'kvfree()' must be used to
free it.

Fixes: 9efa752545 ("net/mlx5_core: Introduce access functions to query vport RoCE fields")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-16 17:48:17 -07:00
Rahul Lakkireddy 8e725f7caa cxgb4: update LE-TCAM collection for T6
For T6, clip table is separated from main TCAM. So, update LE-TCAM
collection logic to collect clip table TCAM as well. IPv6 takes
4 entries in clip table TCAM compared to 2 entries in main TCAM.

Also, in case of errors, keep LE-TCAM collected so far and set the
status to partial dump.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 15:01:30 -04:00
Michal Kalderon 490068deae qed: Fix LL2 race during connection terminate
Stress on qedi/qedr load unload lead to list_del corruption.
This is due to ll2 connection terminate freeing resources without
verifying that no more ll2 processing will occur.

This patch unregisters the ll2 status block before terminating
the connection to assure this race does not occur.

Fixes: 1d6cff4fca ("qed: Add iSCSI out of order packet handling")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 14:52:23 -04:00
Michal Kalderon ffd2c0d127 qed: Fix possibility of list corruption during rmmod flows
The ll2 flows of flushing the txq/rxq need to be synchronized with the
regular fp processing. Caused list corruption during load/unload stress
tests.

Fixes: 0a7fb11c23 ("qed: Add Light L2 support")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 14:52:23 -04:00
Michal Kalderon f9bcd60274 qed: LL2 flush isles when connection is closed
Driver should free all pending isles once it gets a FLUSH cqe from FW.
Part of iSCSI out of order flow.

Fixes: 1d6cff4fca ("qed: Add iSCSI out of order packet handling")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 14:52:23 -04:00
Michal Kalderon fc16f56b7b qed: Fix LL2 race during connection terminate
Stress on qedi/qedr load unload lead to list_del corruption.
This is due to ll2 connection terminate freeing resources without
verifying that no more ll2 processing will occur.

This patch unregisters the ll2 status block before terminating
the connection to assure this race does not occur.

Fixes: 1d6cff4fca ("qed: Add iSCSI out of order packet handling")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 14:49:01 -04:00
Michal Kalderon 6291c608a9 qed: Fix possibility of list corruption during rmmod flows
The ll2 flows of flushing the txq/rxq need to be synchronized with the
regular fp processing. Caused list corruption during load/unload stress
tests.

Fixes: 0a7fb11c23 ("qed: Add Light L2 support")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 14:49:01 -04:00
Michal Kalderon 974f6c046a qed: LL2 flush isles when connection is closed
Driver should free all pending isles once it gets a FLUSH cqe from FW.
Part of iSCSI out of order flow.

Fixes: 1d6cff4fca ("qed: Add iSCSI out of order packet handling")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 14:49:00 -04:00
YueHaibing 76e597eb15 net: ethoc: Remove useless test before clk_disable_unprepare
clk_disable_unprepare() already checks that the clock pointer is valid.
No need to test it before calling it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 14:44:29 -04:00
YueHaibing 93120eba48 net: stmmac: Remove useless test before clk_disable_unprepare
clk_disable_unprepare() already checks that the clock pointer is valid.
No need to test it before calling it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 14:43:54 -04:00
Geert Uytterhoeven e49ac9679e net: 8390: ne: Fix accidentally removed RBTX4927 support
The configuration settings for RBTX4927 were accidentally removed,
leading to a silently broken network interface.

Re-add the missing settings to fix this.

Fixes: 8eb97ff5a4 ("net: 8390: remove m32r specific bits")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 14:38:54 -04:00
Hemanth Puranik 9e6881d366 net: qcom/emac: Encapsulate sgmii ops under one structure
This patch introduces ops structure for sgmii, This by ensures that
we do not need dummy functions in case of emulation platforms.

Signed-off-by: Hemanth Puranik <hpuranik@codeaurora.org>
Acked-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 14:33:27 -04:00
Subash Abhinov Kasiviswanathan 721ce0f644 net: qualcomm: rmnet: Remove redundant command check
The command packet size is already checked once in
rmnet_map_deaggregate() for the header, packet and trailer size, so
this additional check is not needed.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 14:23:04 -04:00
Subash Abhinov Kasiviswanathan bbde32d38b net: qualcomm: rmnet: Add support for ethtool private stats
Add ethtool private stats handler to debug the handling of packets
with checksum offload header / trailer. This allows to keep track of
the number of packets for which hardware computes the checksum and
counts and reasons where checksum computation was skipped in hardware
and was done in the network stack.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 14:23:04 -04:00
Subash Abhinov Kasiviswanathan 1eece799d3 net: qualcomm: rmnet: Capture all drops in transmit path
Packets in transmit path could potentially be dropped if there were
errors while adding the MAP header or the checksum header.
Increment the tx_drops stats in these cases.

Additionally, refactor the code to free the packet and increment
the tx_drops stat under a single label.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 14:23:04 -04:00
Florian Fainelli 00e798c7d1 drivers: net: Remove device_node checks with of_mdiobus_register()
A number of drivers have the following pattern:

if (np)
	of_mdiobus_register()
else
	mdiobus_register()

which the implementation of of_mdiobus_register() now takes care of.
Remove that pattern in drivers that strictly adhere to it.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Fugang Duan <fugang.duan@nxp.com>
Reviewed-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 14:20:36 -04:00
Grygorii Strashko c6213eb1ae net: ethernet: ti: cpsw-phy-sel: check bus_find_device() ret value
This fixes klockworks warnings: Pointer 'dev' returned from call to
function 'bus_find_device' at line 179 may be NULL and will be dereferenced
at line 181.

    cpsw-phy-sel.c:179: 'dev' is assigned the return value from function 'bus_find_device'.
    bus.c:342: 'bus_find_device' explicitly returns a NULL value.
    cpsw-phy-sel.c:181: 'dev' is dereferenced by passing argument 1 to function 'dev_get_drvdata'.
    device.h:1024: 'dev' is passed to function 'dev_get_drvdata'.
    device.h:1026: 'dev' is explicitly dereferenced.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
[nsekhar@ti.com: add an error message, fix return path]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 14:13:13 -04:00
David S. Miller 7d6541fba1 mlx5e-updates-2018-05-14
Misc update for mlx5e netdevice driver
 
 From Gal Pressman:
  - Remove MLX5E_TEST_BIT macros and use test_bit instead
  - Use __set_bit when possible
 
 From Eran Ben Elisha:
   - Improve debug print on initial RX posting timeout
 
 From Or Gerlitz:
  - Support offloaded TC flows with no matches on headers
  - mlx5e TC cleanups
 
 Trivial cleanups From Roi, Tariq and Saeed:
   - Use bool as return type for mlx5e_xdp_handle
   - Use u8 instead of int for LRO number of segments
   - Skip redundant checks when providing NUD lastuse feedback
   - Remove redundant vport context vlan update
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJa+gseAAoJEEg/ir3gV/o+m54H/1dMnavEL5kBr4KlLMO1AYrI
 PL+BeOyVyOkXE5h4IJm070SPukV79SoC3ntjj3KsRBw65WhLmo0Lw10GeDwXove1
 BZ+mvNG7kvTZgXG1LfR2R6wvFrum2bcj1h5A4+/BQA9Zur0PusWbZvQ+5s3vQRRX
 TqapkcitAKPxeWPm8YOGsKxiVfnwVgX1C/gE2Jr05aV8veuWF2QlmrAkm38oOasG
 5rZ5fM3YQwaXJrBRg2uI5merIi5sU0GTVbMCdQ8gcS3YHVXDNJxOm60QnT6cVGPi
 eZzIPrU7iksWvYdZ7AR3mqnuoobbgBbbPu36qTyQNJJR+28JJpaTVshnpEhvg1Y=
 =9qrA
 -----END PGP SIGNATURE-----

Merge tag 'mlx5e-updates-2018-05-14' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-updates-2018-05-14

Misc update for mlx5e netdevice driver

From Gal Pressman:
 - Remove MLX5E_TEST_BIT macros and use test_bit instead
 - Use __set_bit when possible

From Eran Ben Elisha:
  - Improve debug print on initial RX posting timeout

From Or Gerlitz:
 - Support offloaded TC flows with no matches on headers
 - mlx5e TC cleanups

Trivial cleanups From Roi, Tariq and Saeed:
  - Use bool as return type for mlx5e_xdp_handle
  - Use u8 instead of int for LRO number of segments
  - Skip redundant checks when providing NUD lastuse feedback
  - Remove redundant vport context vlan update
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 11:36:39 -04:00
Fuyun Liang 6a814413eb net: hns3: Fixes the missing PCI iounmap for various legs
We call pcim_iomap in hclge_pci_init, pcim_iounmap should be called
in error handle of hclge_init_ae_dev.

We call pcim_iomap in hclge_pci_init, but do not call pcim_iounmap in
hclge_pci_uninit. When we remove the hclge.ko and insert it again, a
problem that pci can not map will happen. pcim_iounmap need to be called
in hclge_pci_uninit.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 11:33:08 -04:00
Peng Li fa8d82e853 net: hns3: Add support of .sriov_configure in HNS3 driver
As HNS3 driver will enable SRIOV default and enable all VFs the
HW support, if PF and VF driver compiled to kernel, VF driver
will work on host default, it is not right.

This patch adds support for hns3_driver.sriov_configure to support
user configs the VF_num, and do not enable sriov default.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Suggested-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 11:33:08 -04:00
Yunsheng Lin be8d8cdb8e net: hns3: Fix for fiber link up problem
When hclge_ae_start is called, hdev->hw.mac.link may be set
to one after up/down multi-times, which does not correspond to
the link state of netdev when the netdev is up.

This fixes it by setting hdev->hw.mac.link to zero when
hclge_ae_start is called.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 11:33:08 -04:00
Yunsheng Lin 67bf2541f4 net: hns3: Fixes the back pressure setting when sriov is enabled
When sriov is enabled, the Qset and tc mapping is not longer one
to one relation.

This patch fixes it by mapping all pf and vf's Qset to tc.

Fixes: 848440544b ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 11:33:08 -04:00
Fuyun Liang 0c698257c7 net: hns3: Change return value in hnae3_register_client
A client includes many client instance. Just like ae_algo, Initializing
client instance failed does not represent registering client failed.
The action of registering client just is adding client to the client
list and the result always is true. This patch changes the return
value of hnae3_register_client form a variable value to a fixed value,
makes the function always return ok.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 11:33:08 -04:00
Fuyun Liang 854cf33a63 net: hns3: Change return type of hnae3_register_ae_algo
The ae_algo is used by many ae_devs. It is not only belong to just a
ae_dev. Initializing ae_dev failed does not represent registering ae_algo
failed. Because the action of registering ae_algo just is adding ae_algo
to the ae_algo list and it is always is true, it make no sense to define
return type as int.

This patch changes the return type of hnae3_register_ae_algo from int to
void.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 11:33:07 -04:00
Fuyun Liang 50fbc237b7 net: hns3: Change return type of hnae3_register_ae_dev
If hclge.ko has not been inserted, the value of ret always is zero
in hnae3_register_ae_dev. If hclge.ko has been inserted, the value
of ret is zero or non zero. Different execution ways have different
results. It is confusing.

The ae_dev which is initialized failed can be reinitialized when we
remove hclge.ko and insert it again. For the case initializing client
instance, it is just like the case initializing ae_dev. The main function
of hnae3_register_ae_dev is adding the ae_dev to ad_dev list. Because
adding ae_dev is always ok, we does not need to return any in this
function.

This patch changes the return type of hnae3_register_ae_dev from int
to void.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 11:33:07 -04:00
Fuyun Liang e3afa96365 net: hns3: Add a check for client instance init state
If the client instance is initializd failed, we do not need to uninit it.
This patch adds a state check to check init state of client instance.

Fixes: 38caee9d3e ("net: hns3: Add support of the HNAE3 framework")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 11:33:07 -04:00
Fuyun Liang 3e249d3bed net: hns3: Fix for the null pointer problem occurring when initializing ae_dev failed
When initializing ae_dev failed during loading hclge.ko, the drvdata will
be set to null. When removing hns3.ko, we get a null ae_dev. It causes the
null pointer problem.

This patch removes pci_set_drvdata from error handle of hclge_init_ae_dev
to fix the bug, since pci_set_drvdata has been called in hns3_remove.
Also, we do not need to uninit the ae_dev which is not initialized. And
it may be the one which is initialized failed.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 11:33:07 -04:00
Fuyun Liang 2312e050f4 net: hns3: Fix for deadlock problem occurring when unregistering ae_algo
When hnae3_unregister_ae_algo is called by PF, pci_disable_sriov is
called. And then, hns3_remove is called by VF. We get deadlocked in
this case.

Since VF pci device is dependent on PF pci device, When PF pci device
is removed, VF pci device must be removed. Also, To solve the deadlock
problem, VF pci device should be removed before PF pci device is removed.

This patch moves pci_enable/disable_sriov from hclge to hns3 to solve
the deadlock problem.

Also, we do not need to return EPROBE_DEFER in hnae3_register_ae_dev,
because SRIOV is no longer enabled in the context calling
hnae3_register_ae_dev. Mutex_trylock can be replaced with mutex_lock.

Fixes: 424eb834a9 ("net: hns3: Unified HNS3 {VF|PF} Ethernet Driver for hip08 SoC")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16 11:33:07 -04:00
Alexandre Belloni a556c76adc net: mscc: Add initial Ocelot switch support
Add a driver for Microsemi Ocelot Ethernet switch support.

This makes two modules:
mscc_ocelot_common handles all the common features that doesn't depend on
how the switch is integrated in the SoC. Currently, it handles offloading
bridging to the hardware. ocelot_io.c handles register accesses. This is
unfortunately needed because the register layout is packed and then depends
on the number of ports available on the switch. The register definition
files are automatically generated.

ocelot_board handles the switch integration on the SoC and on the board.

Frame injection and extraction to/from the CPU port is currently done using
register accesses which is quite slow. DMA is possible but the port is not
able to absorb the whole switch bandwidth.

Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-15 16:41:15 -04:00
Kumar Sanghvi 98f3697f8d cxgb4: add tc flower match support for tunnel VNI
Adds support for matching flows based on tunnel VNI value.
Introduces fw APIs for allocating/removing MPS entries related
to encapsulation. And uses the same while adding/deleting filters
for offloading flows based on tunnel VNI match.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14 22:50:15 -04:00
Kumar Sanghvi 849a742c59 cxgb4: Correct ntuple mask validation for hash filters
Earlier code of doing bitwise AND with field width bits was wrong.
Instead, simplify code to calculate ntuple_mask based on supplied
fields and then compare with mask configured in hw - which is the
correct and simpler way to validate ntuple mask.

Fixes: 3eb8b62d5a ("cxgb4: add support to create hash-filters via tc-flower offload")
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14 22:41:29 -04:00
Gal Pressman 0e5c04f6b5 net/mlx5e: Remove MLX5E_TEST_BIT macro
MLX5E_TEST_BIT macro is the same as the already existent test_bit,
remove it and replace all usages.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-14 15:10:21 -07:00
Gal Pressman bfbe205753 net/mlx5e: Use test bit in en accel xmit flow
Replace (mask & bit) check with test_bit.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-14 15:10:21 -07:00
Gal Pressman af5a6c9365 net/mlx5e: Use __set_bit for adaptive-moderation bit in RQ state
Make the code more clear by replacing the existing code with __set_bit.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-14 15:10:21 -07:00
Eran Ben Elisha 1e7477ae8b net/mlx5e: Report all channels with min RX WQEs timeout
Report all channels which got timeout on posting the minimal number of
RX WQEs and not only the first one. Avoid busy wait on every channel,
when one of the RQs check got timeout, poll once for the remaining RQs.

In addition, add channel index to log when failed to get min RX WQEs
This info is needed in order to debug in case of dysfunctional channel.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-14 15:10:21 -07:00
Or Gerlitz 38aa51c134 net/mlx5e: Support offloaded TC flows with no matches on headers
For example:
    tc filter add dev ens2f0_0 parent ffff: flower skip_sw action drop

Note that for eswitch flows, we still always match on the source port.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-14 15:10:21 -07:00
Or Gerlitz d708f90298 net/mlx5e: Get the required HW match level while parsing TC flow matches
Introduce levels of matching on headers of offloaded flows
(none, L2, L3, L4) that follow the inline mode levels.

This is pre-step for us to offload flows without any
matches on headers.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-14 15:10:21 -07:00
Or Gerlitz 547829004c net/mlx5e: Properly order min inline mode setup while parsing TC matches
Set the initial value to none instead of L2, and set to L2
where the previous initial value was assumed. Make sure to
parse L2 matches before L3 matches and L3 before L4.

This is a pre-step to get the match level for more purposes
other than the validating the needed vs. actual inline level.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-14 15:10:21 -07:00
Or Gerlitz 1cab1cd74b net/mlx5e: Use local actions var while processing offloaded TC flow actions
Use local actions variable while parsing the actions of offloaded TC flow.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-14 15:10:21 -07:00
Or Gerlitz 31c8eba5e8 net/mlx5e: Return success when TC offloaded fdb actions parsed ok
Reaching here, means we didn't err anywhere, so lets just
return success.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-14 15:10:21 -07:00
Or Gerlitz c180f67534 net/mlx5e: Avoid redundant zeroing of offloaded TC flow attributes
This is not needed as the attributes are zeroed out on allocation.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-14 15:10:21 -07:00
Or Gerlitz b3a433de7b net/mlx5e: Clean static checker complaints on TC offload and VF reps code
Clean warning/check complaints made by checkpatch on en_{tc,rep}.c

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-14 15:10:21 -07:00
Or Gerlitz 4601266095 net/mlx5e: Remove double defined DMAC header re-write element
The firmware DMAC_47_16 header re-write token was defined twice,
clean it up.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-14 15:10:21 -07:00
Tariq Toukan efb6d7a20c net/mlx5e: Use bool as return type for mlx5e_xdp_handle
Function returns boolean values, use bool instead of int.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-14 15:10:21 -07:00
Tariq Toukan bd206fd52e net/mlx5e: Use u8 instead of int for LRO number of segments
Range of LRO number of segments fits in u8.
Also, bring initialization and declaration together to
save code.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-14 15:10:21 -07:00
Roi Dayan e36d4810f2 net/mlx5e: Skip redundant checks when providing NUD lastuse feedback
It's redundant to continue the loop if we found one flow whose lastuse value
being newer than the last one we reported, since this is enough for us to
trigger a NUD update (neigh_event_send()).

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-14 15:10:21 -07:00
Saeed Mahameed b8c931ba3c net/mlx5e: Remove redundant vport context vlan update
In delete vlan flow an extra call to mlx5e_vport_context_update_vlans
was added by mistake, remove it.

Fixes: 86d722ad2c ("net/mlx5: Use flow steering infrastructure for mlx5_en")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>
2018-05-14 15:10:22 -07:00
Arjun Vynipadath 7cfac88166 cxgb4: do not fail vf instatiation in slave mode
We no longer require a check for cxgb4 to be MASTER
when configuring SRIOV, It was required when we had
module parameter to instantiate vf.

Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14 16:42:09 -04:00